The document discusses an upcoming meetup on data warehousing for beginners hosted by the Prague Data Management Meetup group. It provides context on the group and past meetup topics. It then covers various data management concepts like the data lifecycle, types of data architectures including data warehousing, differences between data lakes and data warehouses, and modern approaches to data integration.
The presentation compares Data Lakes with classical DWHs. Topics like schema-on-read, schema-on-write, security, JSON, data modeling, data integration are covered.
This document provides an overview of OLAP cubes and multidimensional databases. It discusses key concepts such as star schemas, dimensions and hierarchies, cube aggregation and operators like roll-up and drill-down. It also compares the relational and multidimensional models, highlighting how multidimensional databases allow for intuitive analysis and fast retrieval of large datasets by predefining dimensional perspectives.
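The roll-up and drill-down operators described above can be sketched in a few lines. This is a minimal illustration, not code from the deck: the fact rows, the city → country hierarchy, and the measure are all invented for the example.

```python
from collections import defaultdict

# Toy fact rows: (city, country, product, month, sales).
facts = [
    ("Prague", "CZ", "laptop", "2019-01", 10),
    ("Prague", "CZ", "phone",  "2019-01", 20),
    ("Brno",   "CZ", "laptop", "2019-02", 5),
    ("Berlin", "DE", "phone",  "2019-01", 8),
]

def roll_up(rows, level):
    """Aggregate the sales measure up the location hierarchy: city -> country."""
    idx = {"city": 0, "country": 1}[level]
    totals = defaultdict(int)
    for row in rows:
        totals[row[idx]] += row[4]
    return dict(totals)

by_city = roll_up(facts, "city")        # finer grain
by_country = roll_up(facts, "country")  # rolled up one level
print(by_country)  # {'CZ': 35, 'DE': 8}
```

Drilling down is simply the inverse: returning from the country totals to the per-city view. A real OLAP engine precomputes many such aggregates so either direction is a lookup, not a scan.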
Prague data management meetup #30 2019-10-04 - Martin Bém
This document summarizes the agenda for the Prague Data Management Meetup on April 10, 2019. The meetup will feature a presentation from Jeff Pollock on next generation data integration patterns. The meetup series discusses topics related to data management, acquisition, storage, integration, analytics, and usage. It is an open professional group that has been running since 2015.
This document discusses multidimensional databases and provides comparisons to relational databases. It describes how multidimensional databases are optimized for data warehousing and online analytical processing (OLAP) applications. Key aspects covered include dimensional modeling using star and snowflake schemas, data storage in cubes with dimensions and members, and performance benefits of multidimensional databases for interactive analysis of large datasets to support decision making.
The document discusses 4th generation data warehousing and analytics. It describes integrating both structured and unstructured data from internal and external sources using a data supply framework. This framework includes an operational data store, data warehouse, analytics data store, and data marts to support strategic, tactical and operational decision making. It also discusses using descriptive and predictive analytics from big data and measuring the outcomes and benefits.
The document outlines steps to build a data warehouse including gathering business requirements, bringing together data to build an operational data store and staging area, defining a dimensional data model at the appropriate granularity level with slowly changing dimensions, building aggregate navigation from the model, creating reports that allow for drill down, drill across and filtering using the aggregate navigation model, and iterating the process quickly.
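The slowly changing dimension handling mentioned above (Type 2: keep history by versioning rows) can be sketched as follows. The column layout and names are illustrative assumptions, not the document's actual model.

```python
from datetime import date

# Minimal Type-2 slowly changing dimension:
# (surrogate_key, customer_id, city, valid_from, valid_to, is_current)
dim_customer = [
    [1, "C001", "Prague", date(2018, 1, 1), None, True],
]

def scd2_update(dim, customer_id, new_city, as_of):
    """Expire the current row for customer_id and append a new version."""
    for row in dim:
        if row[1] == customer_id and row[5]:
            if row[2] == new_city:
                return  # attribute unchanged, nothing to do
            row[4], row[5] = as_of, False  # close out the old version
    new_key = max(r[0] for r in dim) + 1
    dim.append([new_key, customer_id, new_city, as_of, None, True])

scd2_update(dim_customer, "C001", "Brno", date(2019, 6, 1))
# dim_customer now holds the historical Prague row and a current Brno row
```

Because old rows are retained, facts loaded before the change still join to the Prague version via its surrogate key, which is the point of Type 2.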
Star, Snowflake and Fact-Constellation Schemas - Abdul Aslam
This document compares and contrasts star schema, snowflake schema, and fact constellation schema. It defines each schema and discusses their key differences. Star schema has a single table for each dimension, while snowflake schema normalizes dimensions into multiple tables. Fact constellation allows dimension tables to be shared between multiple fact tables, modeling interrelated subjects. Performance is typically better with star schema, while snowflake schema reduces data redundancy at the cost of increased complexity.
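The structural difference between the schemas compared above is easy to show in miniature: a star schema keeps one denormalized table per dimension, which facts reference by key. The tables and values below are made up for illustration.

```python
# Star schema in miniature: one fact table keyed into two dimension tables.
# A snowflake variant would further split dim_product into separate
# product and category tables; here each dimension is denormalized.
dim_date = {1: {"date": "2019-10-04", "year": 2019}}
dim_product = {10: {"name": "laptop", "category": "electronics"}}

fact_sales = [
    {"date_key": 1, "product_key": 10, "amount": 1200},
    {"date_key": 1, "product_key": 10, "amount": 800},
]

# Answering a query means joining each fact row to its dimensions.
report = [
    (dim_date[f["date_key"]]["year"],
     dim_product[f["product_key"]]["category"],
     f["amount"])
    for f in fact_sales
]
total = sum(amount for _, _, amount in report)
print(total)  # 2000
```

A fact constellation would simply add a second fact table (say, inventory) referencing the same `dim_date` and `dim_product` dictionaries.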
A column-oriented (columnar) database is a DBMS that stores data in columns instead of rows, with the aim of reading and writing data to and from disk efficiently enough to speed up query execution. A column store is a physical storage concept. This document focuses on what a columnar database is, how it works, and its current advantages, disadvantages, and applications. Along the way, the three top-selling columnar databases are discussed with their features, showing that the columnar database is an emerging concept with strong prospects.
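The row-versus-column storage distinction described above can be demonstrated directly; the records here are invented examples.

```python
# Row store: each record is kept together, so reading one attribute
# still touches every record.
rows = [
    ("alice", 30, "Prague"),
    ("bob",   25, "Brno"),
    ("carol", 41, "Ostrava"),
]

# Column store: one array per attribute. An aggregate over 'age' reads
# only that array, which is also highly compressible (uniform type).
columns = {
    "name": ["alice", "bob", "carol"],
    "age":  [30, 25, 41],
    "city": ["Prague", "Brno", "Ostrava"],
}

avg_age_rowwise = sum(r[1] for r in rows) / len(rows)
avg_age_colwise = sum(columns["age"]) / len(columns["age"])
assert avg_age_rowwise == avg_age_colwise == 32.0
```

Both layouts give the same answer; the columnar one touches a third of the data, which is why analytic workloads favor it while row stores remain better for whole-record (OLTP) access.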
Prague data management meetup 2017-01-23 - Martin Bém
The document discusses the components of a data warehouse, including:
- Data stores such as the data warehouse itself, data marts, operational data stores, and big data platforms.
- Data integration tools for extracting, transforming, and loading data from various sources.
- Access tools for querying, reporting, visualization, and advanced analytics.
- Metadata for technical, business, and transformation documentation.
- Administration and management functions like operations, security, and quality assurance.
- Development tools for modeling, ETL design, and testing.
The document discusses various techniques for data warehousing and online analytical processing (OLAP), including constructing data warehouses, star schemas, materialized views, data cubes, and data mining. Specifically, it describes how a data warehouse can be used to integrate data from multiple sources and support complex OLAP queries run against historical data. It provides examples of star schemas, materialized views, data cubes, and market basket analysis to find frequent itemsets.
The document discusses multidimensional databases and data warehousing. It describes multidimensional databases as optimized for data warehousing and online analytical processing to enable interactive analysis of large amounts of data for decision making. It discusses key concepts like data cubes, dimensions, measures, and common data warehouse schemas including star schema, snowflake schema, and fact constellations.
Business intelligence (BI) provides timely insights into business performance by analyzing operational data. It extracts raw data, transforms it, and loads it into data warehouses and data marts for analysis. The BI architecture includes extraction, transformation, and loading (ETL) processes; data warehouses for unified enterprise data storage; and data marts tailored for specific business needs like reporting and analysis. Common BI platforms are IBM Cognos, SAP BO, Oracle BI, and Microsoft SQL Server.
Designing a high-performance data warehouse - Uday Kothari
Just when the world of "Data 1.0" showed some signs of maturing, "outside-in" demands have already initiated some of the disruptive changes to the data landscape. Parallel growth in the volume, velocity and variety of data, coupled with an incessant drive to find newer insights and value in data, poses a big question: is your data warehouse relevant?
In short, the surrounding changes happening in real time are the new "Data 2.0". It is characterized by feeding ever-hungry minds with sharper insights, whether related to regulation, finance, corporate actions, risk management, or purely aimed at improving operational efficiency. The sources in this new "Data 2.0" have to be commensurate with the outside-in demands of customers, regulators, stakeholders and business users; hence, you need a high-"relformance" (relevance + performance) data warehouse that is relevant to your business ecosystem and has the power to scale exponentially.
The webinar starts by giving the audience a sneak preview of what happened in the Data 1.0 world and which characteristics are shaping the new Data 2.0 world. It then delves into the challenges that growing data volumes pose to data warehouse teams, and presents some practical, proven methodologies to address these performance challenges. Finally, it highlights some thought-provoking ways to turbocharge data warehouse initiatives by leveraging newer technologies such as Hadoop. Overall, the webinar teaches attendees to build high-performance, relevant data warehouses capable of meeting newer demands while significantly driving down total cost of ownership.
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi... - Big Data Spain
Martyn Jones presented on Big Data, analytics and 4th generation data warehousing. He discussed the importance of a comprehensive data supply framework to obtain, integrate and analyze data from various sources to provide strategic, tactical and operational decision making support. He described moving beyond traditional data warehousing to a 4th generation approach that leverages big data and analytics to gain insights and measure outcomes.
Solution architecture for big data projects
solution architecture,big data,hadoop,hive,hbase,impala,spark,apache,cassandra,SAP HANA,Cognos big insights
A data warehouse is a structured repository of historic data that is subject oriented, integrated, time variant, and non-volatile. It collects and integrates data from multiple legacy systems to provide accessible information to businesses and help manage knowledge so business partners can gain wisdom. A data warehouse loads data from operational data stores and operational systems of record to provide aggregated and multidimensional reports for long term trend analysis and enterprise-wide reporting.
This document discusses different types of schemas used in multidimensional databases and data warehouses. It describes star schemas, snowflake schemas, and fact constellation schemas. A star schema contains one fact table connected to multiple dimension tables. A snowflake schema is similar but with some normalized dimension tables. A fact constellation schema contains multiple fact tables that can share dimension tables. The document provides examples and comparisons of each schema type.
Column-oriented databases like Infobright Community Edition are well-suited for data warehousing due to their high data compression rates and efficient handling of analytic queries. Infobright uses data packs, knowledge nodes, and an optimizer to retrieve only necessary column data without decompressing entire files. It achieves industry-leading compression of 10-40x by optimizing algorithms for each data type and stores metadata to resolve complex queries without traditional row-based indexing. By integrating with MySQL, Infobright leverages existing connectivity and provides a low-cost option for data warehousing and business intelligence.
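The metadata-driven query resolution described above can be sketched as min/max pruning over compressed blocks. This is only in the spirit of Infobright's data packs and knowledge nodes; the pack structure, sizes, and predicate are simplified assumptions.

```python
# Each "data pack" carries min/max statistics, so a range predicate can
# classify packs without decompressing them: irrelevant (skip), fully
# relevant (answer from metadata), or suspect (must be scanned).
packs = [
    {"min": 1,   "max": 90,  "values": [1, 45, 90]},
    {"min": 100, "max": 180, "values": [100, 150, 180]},
    {"min": 200, "max": 260, "values": [200, 230, 260]},
]

def count_greater_than(packs, threshold):
    scanned = hits = 0
    for p in packs:
        if p["max"] <= threshold:
            continue                      # irrelevant pack: skipped entirely
        if p["min"] > threshold:
            hits += len(p["values"])      # fully relevant: count via metadata
            continue
        scanned += 1                      # suspect pack: decompress and scan
        hits += sum(1 for v in p["values"] if v > threshold)
    return hits, scanned

hits, scanned = count_greater_than(packs, 120)
print(hits, scanned)  # 5 matching values, only 1 pack actually scanned
```

This is why such systems can skip traditional row-based indexing: the per-pack statistics play the role of a coarse index that comes for free with the storage layout.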
This document provides an overview of data warehousing concepts including dimensional modeling, online analytical processing (OLAP), and indexing techniques. It discusses the evolution of data warehousing, definitions of data warehouses, architectures, and common applications. Dimensional modeling concepts such as star schemas, snowflake schemas, and slowly changing dimensions are explained. The presentation concludes with references for further reading.
Data mining and data warehousing have evolved since the 1960s due to increases in data collection and storage. Data mining automates the extraction of patterns and knowledge from large databases. It uses predictive and descriptive models like classification, clustering, and association rule mining. The data mining process involves problem definition, data preparation, model building, evaluation, and deployment. Data warehouses integrate data from multiple sources for analysis and decision making. They are large, subject-oriented databases designed for querying and analysis rather than transactions. Data warehousing addresses the need to consolidate organizational data spread across various locations and systems.
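The association-rule mining mentioned above starts with support counting over transactions. A minimal sketch, with made-up baskets and threshold:

```python
from itertools import combinations
from collections import Counter

# Toy market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def frequent_pairs(transactions, min_support):
    """Return item pairs whose support (fraction of baskets) meets the threshold."""
    counts = Counter()
    for basket in transactions:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    n = len(transactions)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}

pairs = frequent_pairs(transactions, 0.5)
# Both ('bread', 'milk') and ('bread', 'butter') appear in half the baskets.
```

Real miners (e.g. Apriori) extend this by pruning: any itemset containing an infrequent subset can be skipped, keeping the search tractable on large databases.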
The document discusses the structure of data warehouses and data marts. It defines a data warehouse as a subject-oriented, integrated collection of time-variant data used for decision making. A data warehouse can be classified as lite, deluxe, or supreme based on its scope and technologies used. The document also defines a data mart as a scaled-down version of a data warehouse that can be sourced from a data warehouse or developed independently to meet specific user needs. It provides examples of how to implement and structure both data warehouses and data marts.
The document discusses data warehousing and OLAP technology for data mining. It defines a data warehouse as a subject-oriented, integrated, time-variant, and nonvolatile collection of data to support management decision making. It describes how a data warehouse uses a multi-dimensional data model with dimensions and measures. It also discusses efficient computation of data cubes, OLAP operations, and further developments in data cube technology like discovery-driven and multi-feature cubes to support data mining applications from information processing to analytical processing and knowledge discovery.
The document provides an overview of data warehousing and OLAP technology. It defines a data warehouse as a subject-oriented, integrated collection of historical data used for analysis and decision making. It describes key properties of data warehouses including being subject-oriented, integrated, time-variant, and non-volatile. It also discusses dimensional modeling, data cubes, and OLAP for analyzing aggregated data.
In-Memory Database Systems for Big Data Management: SAP HANA Database - George Joseph
SAP HANA is an in-memory database system that stores data in main memory rather than on disk for faster access. It uses a column-oriented approach to optimize analytical queries. SAP HANA can scale from small single-server installations to very large clusters and cloud deployments. Its massively parallel processing architecture and in-memory analytics capabilities enable real-time processing of large datasets.
Presentation "Trends in Records, Document and Enterprise Content Management" at the S.E.R. Conference, Visegrád, Hungary, 28th September 2004, by Dr. Ulrich Kampffmeyer, PROJECT CONSULT. (c) Copyright and authorship rights: Dr. Ulrich Kampffmeyer, PROJECT CONSULT Unternehmensberatung GmbH, Hamburg, 2003-2004. http://www.PROJECT-CONSULT.com
This document defines key concepts in data warehousing including data warehouses, data marts, and ETL (extract, transform, load). It states that a data warehouse is a non-volatile collection of integrated data from multiple sources used to support management decision making. A data mart contains a single subject area of data. ETL is the process of extracting data from source systems, transforming it, and loading it into a data warehouse or data mart.
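The extract-transform-load process defined above can be illustrated as a tiny pipeline. The source format, cleaning rule, and target layout here are all invented for the sketch.

```python
# A toy ETL pipeline: extract raw CSV-like lines, transform them into
# typed records, and load them into an in-memory "warehouse" table.
raw_orders = [
    "1001,2019-10-04, 250.00 ",
    "1002,2019-10-05,99.50",
    "bad line",
]

def extract(lines):
    """Parse source lines, rejecting any that do not have three fields."""
    for line in lines:
        parts = [p.strip() for p in line.split(",")]
        if len(parts) == 3:
            yield parts

def transform(records):
    """Convert string fields into typed warehouse rows."""
    for order_id, day, amount in records:
        yield {"order_id": int(order_id), "day": day, "amount": float(amount)}

def load(rows, warehouse):
    for row in rows:
        warehouse.append(row)

warehouse = []
load(transform(extract(raw_orders)), warehouse)
print(len(warehouse))  # 2 clean rows; the malformed line was rejected
```

Production ETL adds the same three stages around real sources and targets, plus error logging and incremental loading, but the shape of the flow is this.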
As a follow-on to the presentation "Building an Effective Data Warehouse Architecture", this presentation will explain exactly what Big Data is and its benefits, including use cases. We will discuss how Hadoop, the cloud and massively parallel processing (MPP) is changing the way data warehouses are being built. We will talk about hybrid architectures that combine on-premise data with data in the cloud as well as relational data and non-relational (unstructured) data. We will look at the benefits of MPP over SMP and how to integrate data from Internet of Things (IoT) devices. You will learn what a modern data warehouse should look like and how the role of a Data Lake and Hadoop fit in. In the end you will have guidance on the best solution for your data warehouse going forward.
Building an Effective Data Warehouse Architecture - James Serra
Why use a data warehouse? What is the best methodology to use when creating a data warehouse? Should I use a normalized or dimensional approach? What is the difference between the Kimball and Inmon methodologies? Does the new Tabular model in SQL Server 2012 change things? What is the difference between a data warehouse and a data mart? Is there hardware that is optimized for a data warehouse? What if I have a ton of data? During this session James will help you to answer these questions.
Designing a Scalable Data Warehouse Using MySQL - Venu Anuganti
The document discusses designing scalable data warehouses using MySQL. It covers topics like the role of MySQL in data warehousing and analytics, typical data warehouse architectures, scaling out MySQL, and limitations of MySQL for large datasets or as a scalable warehouse solution. Real-time analytics are also discussed, noting the challenges of performance and scalability for near real-time analytics.
Role of MySQL in Data Analytics and Warehousing - Venu Anuganti
The document discusses the role of MySQL in data analytics and data warehousing. It describes how MySQL is widely used by many companies for online transaction processing (OLTP) and is the de facto standard for developers. While MySQL can be used for small data warehousing and analytics tasks, the document recommends using column-oriented databases with compression for large datasets due to MySQL's limitations in scalability for data warehousing. It provides tips on optimizing MySQL for analytics workloads and discusses using OLAP cubes and real-time analytics for near real-time insights.
The document discusses OLAP cubes and data warehousing. It defines OLAP as online analytical processing used to analyze aggregated data in data warehouses. Key concepts covered include star schemas, dimensions and facts, cube operations like roll-up and drill-down, and different OLAP architectures like MOLAP and ROLAP that use multidimensional or relational storage respectively.
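The roll-up and drill-down operators mentioned above can be sketched in a few lines of Python. The fact rows and dimension names below are illustrative only, not taken from any of the decks listed here:

```python
from collections import defaultdict

# Toy fact table: (year, month, region, product, sales). Hypothetical data.
facts = [
    ("2019", "01", "EU", "laptop", 100.0),
    ("2019", "01", "EU", "phone",   40.0),
    ("2019", "02", "US", "laptop",  70.0),
    ("2019", "02", "EU", "phone",   30.0),
]

def roll_up(rows, group_keys):
    """Aggregate the sales measure over the chosen dimension attributes.

    group_keys is a tuple of column indices, e.g. (0,) rolls sales up
    to the year level, (0, 1) drills back down to (year, month).
    """
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in group_keys)
        totals[key] += row[-1]
    return dict(totals)

# Roll-up: from the monthly grain to the coarser yearly grain.
print(roll_up(facts, (0,)))    # {('2019',): 240.0}
# Drill-down: back to the finer (year, month) grain.
print(roll_up(facts, (0, 1)))  # {('2019', '01'): 140.0, ('2019', '02'): 100.0}
```

A MOLAP engine precomputes and stores such aggregates for many key combinations; a ROLAP engine issues the equivalent GROUP BY against relational storage at query time.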
Virtualisation de données : Enjeux, Usages & Bénéfices (Data Virtualization: Challenges, Uses & Benefits) – Denodo
Watch full webinar here: https://bit.ly/3oah4ng
Gartner recently identified Data Virtualization as a cornerstone of data integration architectures.
Discover:
- The benefits of a data virtualization platform
- The growing range of use cases: Lakehouse, Data Science, Big Data, Data Services & IoT
- Building a unified view of your data estate without compromising performance
- Building an agile data integration architecture: on-premise, in the cloud, or hybrid
The document provides an overview of SAP's Business Intelligence (BI) solution, including its key capabilities and components. It discusses how SAP BI integrates data warehousing, a BI platform, business intelligence tools, and pre-configured business content to deliver actionable insights. It also addresses how SAP BI and SAP NetWeaver help enable information integration, collaboration, and universal data access across the enterprise.
The document outlines the agenda for a data warehousing training course. The agenda covers topics such as data warehouse structure and modeling, extract transform load (ETL) processes, dimensional modeling, aggregation, online analytical processing (OLAP), and data marts. Time is allocated to discuss loading, refreshing, and querying the data warehouse.
MammothDB is the first inexpensive enterprise analytics database, offered in the cloud or on-premises.
It's pointless to have big, or even medium-sized, data if you don't have the ability to easily use and understand it. We're making enterprise analytics accessible to every company in the world, particularly the under-served 88% of global companies that don't have enterprise analytics/business intelligence today.
Prague data management meetup 2017-02-28 – Martin Bém
The document discusses an operational data store (ODS) implemented to integrate data from two banks, Velká česká banka and Nová česká banka, after a transactional integration. APIs, ETL workflows, and data transformations populate the ODS with consolidated customer, account, and transaction data from both banks for operational reporting. It also details the data domains integrated into the ODS and the growth in API usage over time as more systems accessed the shared ODS.
Power BI for Big Data and the New Look of Big Data Solutions – James Serra
New features in Power BI give it enterprise tools, but that does not mean it automatically creates an enterprise solution. In this talk we will cover these new features (composite models, aggregation tables, dataflows) as well as Azure Data Lake Store Gen2, and describe the use cases and products of an individual, departmental, and enterprise big data solution. We will also talk about why a data warehouse and cubes should still be part of an enterprise solution, and how a data lake should be organized.
Data Warehouse Design and Best Practices – Ivo Andreev
A data warehouse is a database designed for query and analysis rather than for transaction processing. An appropriate design leads to scalable, balanced and flexible architecture that is capable to meet both present and long-term future needs. This session covers a comparison of the main data warehouse architectures together with best practices for the logical and physical design that support staging, load and querying.
This document discusses several topics related to SQL Server Analysis Services (SSAS) including:
- Best practices for SSAS design including dimensions, measures, partitioning and security.
- New features in the upcoming "Denali" release including the BI semantic model and PowerPivot integration.
- Performance tuning techniques such as distinct count optimization and scale out queries.
- Tools for analyzing SSAS queries and cube design best practices.
- Design considerations for large enterprise solutions including partitioning, hardware sizing and concurrency management.
The document provides an overview of business intelligence, data warehousing, and ETL concepts. It defines business intelligence as using technologies to analyze data and support decision making. A data warehouse stores historical data from transaction systems and supports querying and analysis for insights. ETL is the process of extracting data from sources, transforming it, and loading it into the data warehouse for analysis. The document discusses components of BI systems like the data warehouse, data marts, and dimensional modeling and provides examples of how these concepts work together.
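The extract-transform-load flow described above can be made concrete with a minimal sketch. The CSV layout, table name, and conversion rates below are invented for illustration; a real pipeline would read from source systems and load a managed warehouse:

```python
import csv
import io
import sqlite3

# Hypothetical source extract: a CSV exported from a transaction system.
SOURCE_CSV = """order_id,customer,amount,currency
1,alice,100,CZK
2,bob,20,EUR
3,alice,50,CZK
"""

RATES_TO_CZK = {"CZK": 1.0, "EUR": 25.0}  # assumed conversion rates

def extract(text):
    # Extract: parse the raw source feed into records.
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    # Transform: type the fields and conform amounts to one currency.
    return [
        (int(r["order_id"]), r["customer"],
         float(r["amount"]) * RATES_TO_CZK[r["currency"]])
        for r in rows
    ]

def load(rows, conn):
    # Load: write the conformed records into the warehouse fact table.
    conn.execute("CREATE TABLE IF NOT EXISTS fact_order"
                 "(order_id INTEGER, customer TEXT, amount_czk REAL)")
    conn.executemany("INSERT INTO fact_order VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)
total = conn.execute("SELECT SUM(amount_czk) FROM fact_order").fetchone()[0]
print(total)  # 650.0
```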
Logical Data Warehouse: How to Build a Virtualized Data Services Layer – DataWorks Summit
The document discusses the emergence of logical data warehouses in response to big data. It describes how a logical data warehouse uses virtualization, distributed processing, and other techniques to provide a unified view of data across different repositories like Hadoop, relational databases and NoSQL stores. It also discusses how organizations can optimize resources by offloading analytical workloads from their enterprise data warehouse to Hadoop clusters to reduce costs while still using existing code and applications.
Oracle BI Hybrid BI: Mode 1 + Mode 2, Cloud + On-Premise Business Analytics – Mark Rittman
Mark Rittman, founder of Rittman Mead, discusses Oracle's approach to hybrid BI deployments and how it aligns with Gartner's vision of a modern BI platform. He explains how Oracle BI 12c supports both traditional top-down modeling and bottom-up data discovery. It also enables deploying components on-premises or in the cloud for flexibility. Rittman believes the future is bi-modal, with IT enabling self-service analytics alongside centralized governance.
Getting Started with Data Virtualization – What Problems DV Solves – Denodo
Experts and analysts agree on data virtualization's strategic role in enterprise architecture for increasing agility and flexibility in the delivery of information. In this presentation, you will see how data virtualization enables organizations to access, manage, and integrate data from a wide variety of data sources.
This presentation is part of the Fast Data Strategy Conference, and you can watch the video here goo.gl/IS9RGK.
Similar to Prague data management meetup #31 2020-01-27
This document discusses trends in data warehousing and analytics. It provides an overview of the evolution of data warehousing from its origins in the 1980s to modern approaches. Key stages discussed include the rise of data marts and ETL in the 1990s-2000s, the emergence of big data and Hadoop in the 2010s, and current approaches like logical data warehousing, data lakes, and machine learning/AI. It also examines ongoing challenges around data volume, complexity, legacy systems, and others.
This document summarizes a blockchain meetup in Prague in October 2018. The agenda included an overview of blockchain technology, platforms, and a question and answer session. Blockchain was defined and examples like Bitcoin and Ethereum were provided. Popular platforms like Hyperledger, Ethereum Enterprise Alliance, and Corda were also listed and criteria for evaluating blockchain platforms was presented. Use cases for identity management and trade on the blockchain were briefly discussed.
Prague data management meetup 2018-03-27 – Martin Bém
This document discusses different data types and data models. It begins by describing unstructured, semi-structured, and structured data. It then discusses relational and non-relational data models. The document notes that big data can include any of these data types and models. It provides an overview of Microsoft's data management and analytics platform and tools for working with structured, semi-structured, and unstructured data at varying scales. These include offerings like SQL Server, Azure SQL Database, Azure Data Lake Store, Azure Data Lake Analytics, HDInsight and Azure Data Warehouse.
Prague data management meetup 2018-02-27 – Martin Bém
This document discusses the agenda for the Prague Data Management Meetup on February 27, 2018. The topics included an overview of the meetup group, Gartner's Magic Quadrant for Data Management Solutions for Analytics, and the second part of an introduction to data warehouse modeling (Základy modelování DW #2). The meetup group focuses on topics related to data management, acquisition, storage, integration, analytics, and usage. A history of past meetup topics is also provided.
Prague data management meetup 2017-11-21 – Martin Bém
The document summarizes an upcoming Prague Data Management Meetup event on Big Data. The event agenda includes a discussion on Big Data architectures, covering topics like ETL vs ELT on Hadoop, Lambda and Kappa architectures, polyglot processing, and the 7 V's of Big Data (Volume, Velocity, Variety, Variability, Veracity, Visualization, and Value). The speaker will be Kuba Augustin, discussing Big Data quickly and wildly.
Prague data management meetup 2017-09-26 – Martin Bém
This document discusses current trends in data management that were presented at the Prague Data Management Meetup on September 26, 2017. It begins with listing the agenda and history of past meetup events. Then, the main section analyzes trends in data governance, big data, data science, machine learning, artificial intelligence, data lakes, self-service BI, smartphone BI, advanced analytics, collaborative BI, appliances, visual data discovery, data storytelling, augmented analytics, cloud integration, cloud analytics, advanced data platform architectures, internet of things, data warehouse modernization, automation, analytical databases, and data source federation. It concludes with a final joke.
Prague data management meetup 2017-03-28 – Martin Bém
This document provides an overview of metadata and its role in data warehousing (DW) and business intelligence (BI). It discusses different types of metadata including descriptive, structural, and administrative metadata. Examples of metadata are provided relating to conceptual models, business rules, processes, data structures, transformations and movement. The importance of metadata for context, consolidation, and ensuring truth in data is highlighted. The metadata lifecycle of creating, maintaining, updating, storing, and publishing is also summarized.
Prague data management meetup 2016-11-22 – Martin Bém
The document discusses Prague Data Management Meetup, an open professional group that meets monthly to discuss topics related to data management. It then provides an agenda and history for past meetup events, covering subjects like data lakes, dark data, self-service BI, and data warehouse modeling. The remainder of the document focuses on data warehouse modeling, including comparisons of operational databases versus data warehouses, different data modeling approaches, and best practices for data warehouse design like using standard naming conventions and domain types.
Prague data management meetup 2016-03-07 – Martin Bém
The document summarizes an upcoming meetup about data warehousing modeling issues. The meetup will discuss sad stories related to data warehouse architecture, governance, data quality, integration, operations, and data modeling. Some examples of issues that will be discussed include too generic data models, industry data models being misapplied, missing constraints in data models, and copying data structures from source systems 1:1 without normalization. The meetup is part of a regular series organized by a Prague data management group.
Prague data management meetup 2016-01-12 – Martin Bém
The document summarizes a Prague data management meetup. The agenda included an introduction about the group, and a presentation on the data lake concept. A data lake is defined as a massive, easily accessible data repository for storing big data without dropping attributes below aggregation levels. It aims to retain all attributes without knowing the scope or use of the data in advance. Quotes from industry experts provide perspectives on data lakes being large storage repositories and one of the more controversial ways to manage big data. Key factors for data lakes include metadata, data quality, technology used, value added, costs, security, governance, and data load processes.
Prague data management meetup 2015-11-23 – Martin Bém
The document summarizes a meetup about dark data. It defines dark data as data that is collected and stored by organizations but not used for insights or decision making. Examples of typical dark data sources are log files, customer information, previous employee data, and old documents. Reasons why dark data grows include legal risks, lost opportunities, and open-ended exposure. Estimates suggest 80-90% of organizational data is dark. Tips to manage dark data include implementing data governance, ongoing data assessment, retention policies, and specifically auditing dark data for security.
2. PRAGUE DATA MANAGEMENT MEETUP (PDM MEETUP)
– Open professional group
– Based on www.meetup.com
– Everyone is welcome
– There are no bad topics, only bad speakers ☺
– You can show anything to others
– Operational since September 2015
– Sponsored by ADASTRA
Scope: data management, data acquisition, data storing, data integration, data analytics, data usage
3. MEETUP HISTORY
#1 – 10. 9. 2015 – Data Management
#2 – 14. 10. 2015 – Data Lake
#3 – 23. 11. 2015 – Dark Data (without Dark Energy and Dark Force)
#4 – 12. 1. 2016 – Data Lake
#5 – 7. 3. 2016 – Sad Stories About DW/BI Modeling (sad only)
#6 – 23. 3. 2016 – Self-service BI Street Battle
#7 – 27. 4. 2016 – Let's explore the new Microsoft PowerBI!
#8 – 22. 9. 2016 – Data Management pro začátečníky (Data Management for Beginners)
#9 – 17. 10. 2016 – Small Big Data
#10 – 22. 11. 2016 – Základy modelování DW/BI (DW/BI Modeling Basics)
#11 – 23. 1. 2017 – Komponenty datových skladů (Data Warehouse Components)
#12 – 28. 2. 2017 – Operational Data Store
#13 – 28. 3. 2017 – Metadata v DW/BI (DW/BI Metadata)
#14 – 25. 4. 2017 – Jak se stát DW/BI konzultantem (Be a DW/BI Consultant)
#15 – 16. 5. 2017 – SQL
#16 – 29. 5. 2017 – From IoT to AI: Applications of time series data
#17 – 26. 9. 2017 – Aktuální trendy v data managementu (Current trends in data management)
#18 – 24. 10. 2017 – Datové platformy na technologiích Oracle (Data platforms based on Oracle)
#19 – 21. 11. 2017 – Big Data rychle a zběsile (Big Data Fast and Furious)
#20 – 30. 1. 2018 – Jak se staví velké datové sklady (How to build a huge data warehouse)
#21 – 27. 2. 2018 – Základy modelování DW/BI #2 (DW/BI Modeling Basics #2)
#22 – 27. 3. 2018 – Big Data: How to deal with sensorics (floating) data easily
#23 – 17. 4. 2018 – DW/BIaaS
#24 – 22. 5. 2018 – Be a Consultant (Jak se stát konzultantem)
#25 – 19. 6. 2018 – Building AI-Powered Retail Store
#26 – 17. 9. 2018 – Information Management 101
#27 – 23. 10. 2018 – Blockchain
#28 – 29. 1. 2019 – DW & BI trendy v roce 2019 (DW & BI Trends in 2019)
#29 – 26. 3. 2019 – Data Warehouse Automation
#30 – 10. 4. 2019 – Next Gen Data Integration Patterns With Jeff Pollock
#31 – 26. 1. 2020 – Data Warehousing pro začátečníky (Data Warehousing for Beginners)
11. BRIEF DATA MANAGEMENT HISTORY
Prehistory (1985–1995)
– Controlled Chaos
– Best Practice Awakening
– Manual Scripting
– Primeval Relational Analytics
Antiquity (1995–2005)
– Titans: Kimball vs. Inmon
– Maturing Best Practices
– Enterprise Data Warehouse
– ETL
– OLAP
– Reference Data Management
– Classic Relational Analytics
Middle Age (2005–2015)
– Traditional Data Warehouse
– Hub-and-Spoke Architecture
– Data Governance
– Master Data Management
– Metadata-Driven Development
– ELT
– Data Vault
– Data Mining
– DW Appliance
– Columnar DB
– In-memory DB
– Hadoop Stack Dawn
– Unstructured Data Analytics
Modern Age (2015–2025)
– Cloud
– Automation
– Logical Data Warehouse
– Extended Data Warehouse
– Data Lake
– Polyglot Architecture
– Kappa / Lambda
– Databus
– Data Pipeline
– Real-time Data Integration
– Big Data ETL
– Open Source Analytics
– Big Data Analytics
– Self-service BI & ETL
– Data Science
– Machine Learning & AI
– Hadoop without Hadoop
– Stream Analytics
– All-data Analytics
– Data Management Platform
– Autonomous Technologies
– Decoupled Compute & Storage
– Serverless
Future? (2025–∞)
12. Data Landscape
[Diagram: data sources (Core Backends, Social Networks, Web Data, External Data, Sensors, Communication, Master Data, Devices) feeding Data Analytics: Reporting, Business Intelligence, Data Visualization.]
13. Data Landscape
[Diagram: the same landscape with Data Warehousing (Operational Data, Star Schema, Snowflake Schema, OLAP, Enterprise Core Data, Planning) and Dark Data (Unused Data) overlaid; Data Analytics now also covers Data Discovery and Segmentation.]
14. Data Landscape
[Diagram: Big Data joins Data Warehousing and Dark Data over the same source landscape. New data types: Documents, Voice, Geo Data, Graph, Log, Semi-structured data, Biometrics, Image, Messages, Cold Data, DW Archive. Data Analytics grows with Data Science, Machine Learning, Network Analytics, Predictive Analytics, Automated Decisions, and Recommendations.]
15. Data Landscape — Fast Data & Deep Data
Diagram: adds Fast Data (real-time stream processing, sensor processing, events, messages) and Deep Data (vision, voice, biometrics, mined data) on top of the Big Data, Data Warehousing and Dark Data zones.
16. CLASSICAL DATA WAREHOUSE
– The key data platform for decades, but no longer the only one
– A data system used for reporting and data analysis, considered a core component of business intelligence. DWs are central repositories of integrated data from one or more disparate sources.
– A large amount of company information stored on a computer and used for making business decisions
– An old, mature concept
– Core Features
– Database (usually an RDBMS)
– Subject Orientation
– Data Integration
– History
– Structure Stability
– Batch processing & significant data latencies
– Known under many acronyms: DW, DWH, MIS, ADS, ADW, EDW, DP

Diagram: Data Sources → Data Acquisition → Data Staging → Data Integration → Data Repository → Reporting, Analytics & Other Data Usage.
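The core features above — integration of disparate sources, subject orientation, and history — can be sketched with a toy example. The data and field names are hypothetical, not the presentation's actual model:

```python
# Toy sketch of DW-style integration of two disparate sources.
crm = [{"cust_id": 1, "name": "Alice"}, {"cust_id": 2, "name": "Bob"}]
billing = [{"customer": 2, "name": "Bob"}, {"customer": 3, "name": "Carol"}]

# Data integration: unify keys and column names into one subject-oriented table
unified = {r["cust_id"]: {"cust_id": r["cust_id"], "name": r["name"], "src": "CRM"}
           for r in crm}
for r in billing:  # rename 'customer' -> 'cust_id', keep first occurrence
    unified.setdefault(r["customer"],
                       {"cust_id": r["customer"], "name": r["name"], "src": "BILLING"})

# History / audit: a DW stamps load metadata rather than overwriting silently
customer_dim = [dict(row, load_dt="2019-10-04") for row in unified.values()]
print(sorted(c["cust_id"] for c in customer_dim))  # -> [1, 2, 3]
```

The point is that the warehouse, not the consumer, resolves naming and key conflicts once, so every report sees the same integrated "customer" subject.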
17. CLASSICAL DATA WAREHOUSE ARCHITECTURES (HUB-AND-SPOKE)

Ralph Kimball — Data Warehouse Bus (DW), bottom-up:
Data Sources → Data Staging Area → business transformation → Conformed Data Marts with Conformed Dimensions ("Kimball's Data Warehouse") → RDBMS → Reporting & Data Apps

Bill Inmon — Enterprise Data Warehouse (EDW), top-down:
Data Sources → Data Staging Area → technical transformation → Data Warehouse → business transformation → Data Marts → RDBMS → Reporting & Data Apps

Dan Linstedt — Data Vault (DV), top-down:
Data Sources → Data Staging Area → technical transformation → Data Vault → business transformation → Business Vault → Data Marts → RDBMS → Reporting & Data Apps
18. DW Logical Layers

Layers: L0 Stage Area → L1 Relational Area & Consolidation Area → L2 Data Mart Area

– Data Mart Area (L2): user access layer
– Consolidation Area (consolidated L1): cleansed and consolidated data; common aggregates for L2
– Relational Area (detailed L1): consistent, integrated, subject-oriented data; universal data structure; historical data; maximal detail; the system of record (Foundation Layer)
– Stage Area (L0): direct copy of source systems

Note: consolidated and detailed L1 can share the same data structures.

Diagram "General DWH": source systems (S1–S4, Customer DB, other) → ETL → Staging Area / ODS → ETL (CDB, EAI) → Relational Area → ETL & materialization → Data Mart Area (dependent data marts) → Presentation Layer: Reporting, OLAP, extracts, analytic tools (SPSS, SAS, ...), applications.
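The L0 → L1 → L2 flow can be illustrated with a minimal in-memory sketch; the "layers" here are plain Python lists and dicts standing in for database schemas:

```python
# Hypothetical in-memory sketch of the L0 -> L1 -> L2 layer flow.
source_system = [{"cust": "a", "amt": 10}, {"cust": "a", "amt": 5}, {"cust": "b", "amt": 7}]

# L0 Stage Area: direct copy of the source, no changes
l0_stage = [dict(r) for r in source_system]

# L1 Relational Area: integrated naming, maximal detail (system of record)
l1_relational = [{"customer_id": r["cust"], "amount": r["amt"]} for r in l0_stage]

# L1 Consolidation Area: common aggregates prepared for L2
l1_consolidated = {}
for r in l1_relational:
    l1_consolidated[r["customer_id"]] = l1_consolidated.get(r["customer_id"], 0) + r["amount"]

# L2 Data Mart Area: user access layer, shaped for one reporting subject
l2_mart = [{"customer": k, "total_amount": v} for k, v in sorted(l1_consolidated.items())]
print(l2_mart)  # -> [{'customer': 'a', 'total_amount': 15}, {'customer': 'b', 'total_amount': 7}]
```

Each layer only reads from the one below it, which is what makes the marts "dependent" in the hub-and-spoke sense.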
19. DATA INTEGRATION PATTERNS
(each pattern moves data from Source to Target)
– ETL: Extract → Transform → Load
– ELT: Extract → Load → Transform
– TEL: Transform → Extract → Load
– ETLT: Extract → Transform → Load → Transform
– Data API: API Call → API Logic → Data API (mediator)
– CDC: Change Capture → Replication / Transport → Extract → Load
– Pub/Sub: Publisher → Broker → Subscription
– Data Pipeline: chained stages between source and target
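ETL and ELT differ only in where the transform step runs — in the pipeline before loading, or inside the target after loading. A minimal sketch, with hypothetical stand-in functions:

```python
# ETL vs. ELT in miniature; extract/transform/load are hypothetical stand-ins.
def extract(source):
    return list(source)                  # pull raw rows from the source

def transform(rows):
    return [r.upper() for r in rows]     # any cleansing / business rule

def load(rows, target):
    target.extend(rows)                  # write rows into the target

source = ["a", "b"]

# ETL: transform in the pipeline, load only clean data (schema-on-write style)
dw = []
load(transform(extract(source)), dw)

# ELT: load raw data first, transform inside the target (schema-on-read style)
lake = []
load(extract(source), lake)
lake = transform(lake)

print(dw, lake)  # -> ['A', 'B'] ['A', 'B']
```

Both paths end with the same data; the trade-off is where compute happens and whether raw data is retained in the target.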
21. Modern Data Architectures (ordered by complexity)

Hub-and-Spoke Data Warehouse: Data Sources → Data Ingest (Messaging, CDC, Bulk Copy, Files, Data Extractor) → Data Acquisition → Data Integration → Data Warehouse (RDBMS) → Data Marts → Reporting & Data Apps.

Polyglot: adds a Data Lake and a Serving Layer (REST, SQL, Pub/Sub) for Analytics alongside the Data Warehouse and Data Marts.

Data Federation / Data Virtualization / Logical Data Warehouse: a Query Engine federates across the Data Warehouse, Data Marts, Data Lake and other stores.

Lambda: a Batch Layer (Object Storage) plus a Speed Layer with a Pipeline Manager, merged in the Serving Layer.

Kappa / Databus: stream-first; all data flows through a Databus / Speed Layer with a Pipeline Manager feeding the Data Lake, Data Warehouse and Analytics.
22. DW vs. DL vs. XDW/DP
(Traditional Data Warehouse (DW) | Data Lake (DL) | Extended Data Warehouse (XDW) / Data Platform (DP))

– Data: Structured | Structured, semi-structured & unstructured | Structured, semi-structured & unstructured
– Data Processing: Processed | Raw | Processed & raw
– Data Schema: Schema-on-write | Schema-on-read | Schema-on-write & schema-on-read
– Data Model: Relational | Object-based | Relational & object-based
– Data History: Hierarchically archived | No hierarchy | Both
– Configuration: Fixed | Reconfigured anytime as needed | Fixed & reconfigurable as needed
– Security: Mature | Maturing | Mature
– Primary Users: Data analysts & business professionals | Data scientists | Data analysts, business professionals & data scientists
– Technology: RDBMS | NoSQL DBMS, Hadoop, other distributed storages | RDBMS, NoSQL DBMS, Hadoop, other distributed storages
– Agility: Low | High | Medium
– Added Value: Medium | Medium | High
– Cost: High | Low | Medium
– Operation: After full release | From start | From start
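The key contrast in the table — schema-on-write versus schema-on-read — can be sketched as follows. The record shapes and helper names are hypothetical:

```python
import json

# Schema-on-write: validate and shape data BEFORE storing (DW style)
def write_to_dw(record, table):
    assert set(record) == {"id", "amount"}, "schema enforced at load time"
    table.append({"id": int(record["id"]), "amount": float(record["amount"])})

# Schema-on-read: store the raw payload, interpret only at query time (DL style)
def write_to_lake(raw, files):
    files.append(raw)                    # no validation; raw payload kept as-is

def read_from_lake(files):
    # schema applied only when reading; unknown fields are simply ignored
    return [{"id": int(j["id"]), "amount": float(j.get("amount", 0))}
            for j in map(json.loads, files)]

dw, lake = [], []
write_to_dw({"id": "1", "amount": "9.5"}, dw)
write_to_lake('{"id": 1, "amount": 9.5, "extra": "kept raw"}', lake)

print(dw)
print(read_from_lake(lake))
```

This is why the table rates the DW "processed / fixed configuration" and the lake "raw / reconfigured anytime": the lake defers schema decisions to each reader, at the cost of weaker guarantees at load time.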
23. DataOps vs. Adastra Information Management

Concept map, grouped by category:
– Architecture: Hub & Spoke, Lambda, Kappa / Databus, Polyglot, Big Data Fabric, Legacy
– Data Model: E-R Model, Graph Data Model, Multidimensional Model, Star Schema, Snowflake Schema
– Data Ingest: Data Loading, Data Replication, Change Data Capture, Manual Inputs, Stream Processing, Data Pipeline, Data API
– Data Integration: ETL/ELT, Cleansing, Standardization, Aggregation, Reconciliation, Key Management, Orchestration, Automation
– Data Management: Metadata, Data Catalog, Data Governance, Governance, Reference & Master Data Management, Data Security, Data Retention, Data Tiering, Data Archive, Time Variance, Data Latency, Audit, TCO Management, SLA Management, Data Literacy, DataOps
– Database: RDBMS, In-memory, Document Store, Multidimensional DB, Graph DBMS, Columnar DBMS, NoSQL, Key-Value, Column Family, Object Store, Distributed File System, File Repository
– Data Repository: Data Warehouse, Data Mart, Data Lake, Operational Data Store, Master Data Repository, Sandbox
– Deployment: On-premise, Cloud, Hybrid Cloud, Multi-Cloud, Containers
– Data Usage: Business Intelligence, Reporting, Machine Learning, Data Science, Data Discovery, Data Ad-hoc Querying
24. BUSINESS PRIORITIES VS. CLASSICAL DATA WAREHOUSES

Business priorities:
– Grow revenue & profit
– Improve CX
– Improve products and services
– 360-degree view
– Digital transformation
– Accelerate responses to business and market changes
– Real-time data-driven decisions
– Faster predictive insights
– Smarter, intelligent business

Classical DW limitations:
– Structured, static data only
– Melting down under data growth
– Business demand exceeds IT capacities & IT budgets
– Data siloed across multiple platforms
– Growing operational overhead
– Missing real-time insights
– Unscalable
– Limited advanced analytics
– Very expensive TCO
– Outdated governance and security
25. CZ Banka A — Data Warehouse

Diagram: Data Sources → Data Acquisition → Data Staging → Data Integration → ODS Data Repository & Data Warehouse Core → Data Marts → Reporting, Analytics & Other Data Usage. The Operational Data Store exposes an ODS Data API to business processes; data synchronization links the warehouse with a Master Data Repository; a Data Quality Engine (with a DQ Data API) and External Calculation Engines serve the surrounding processes.
26. CZ Banka B — Data Warehouse

Diagram: Data Sources → Data Acquisition → Data Staging → Data Integration → ODS Data Repository & Data Warehouse Core → Data Marts → Reporting, Analytics & Other Data Usage. The Operational Data Store exposes an API to business processes; data synchronization links the warehouse with a Master Data Repository and a Reference Data Repository (with its own user interface for reference data); a Data Quality Engine handles cleansing.
28. Liberty Bank

Diagram (Web Server / Application Server / Data Store tiers):
– Web Server: Pentaho Data Integration (Web Console), Adastra Workflow GUI, Adastra Ref Books GUI
– Application Server: Adastra Workflow Middleware, Adastra Ref Books Middleware, Pentaho Data Integration (Carte)
– Data Store: Pentaho Data Integration (Repository), Adastra Workflow for RDBMS, Database Scheduler, Adastra Ref Books Store, Adastra ELT
– External components (design time): SAP PowerDesigner, Adastra Code Generator, Adastra Data Model

The diagram separates Runtime from Design Time components.
29. IKEA: Data Warehouse as a Managed Service
Data volume
& Processing
Data from 3 countries: CZ/HU/SK
8 stores
2 830 000 customers
Purchases from 2007 till now
105 000 000 transactions
620 000 000 transaction items
295 000 000 email events
1 TB total database size
Daily load takes about 5 hours (2 + 3)
DWH server
Configuration
Virtual Server 8vCPU, 32GB RAM, 1.5TB HDD
Adastra ETL Framework & MS SSIS
Cloud4Com
VPN Cloud-IKEA
MS SQL Server 2017 Standard Edition
30. Duo Bank of Canada: Data Warehouse as a Managed Service
Data Warehouse as a Managed Service in the Cloud by Adastra CA
Best-shoring and support by Adastra BG.
Payment Card Industry Data Security Standard (PCI DSS) compliance
"We created Duo Bank to do things differently. With a customer-focused mindset, we're committed to changing the way businesses connect with their customers by reimagining and recreating value-driven financial products and services. At the heart of everything we do is our commitment to innovation, customer experience, efficiency and delivering exceptional value."
31. Integrating an On-premise Solution with Cloud Analytics

Azure variant: On-premise Data Sources → Data Loader → Data Lake (Landing & Staging Area and Raw Data Area on Azure Data Lake Storage, orchestrated by Azure Data Factory, cataloged in Azure Data Catalog) → Data Mart Area (Azure SQL Database) → Business Intelligence (Microsoft Power BI).

AWS variant: On-premise Data Sources → Data Loader → Data Lake (Landing & Staging Area and Raw Data Area on Amazon S3, orchestrated by AWS Data Pipeline, cataloged by AWS Glue, queried with Amazon Athena) → Data Mart Area (Amazon RDS) → Business Intelligence (Amazon Insight).
33. Physical Data Model (ERD) — Account Subject Area

Diagram: a PowerDesigner-style physical model of the Account subject area in the ADS layer. ADS fact and transactional tables (Account, Account Balance Fact, Account Provision Fact, Account RWA Fact, Account Transaction, Account Future Transaction, Account Interest, FX Rate, Card, Party, Party Bank Contact, POS) reference shared code tables (Account Status, Account Type, Accounting Standard Type, Authorization Status, Blocking Status, Blocation Type, Channel, Credit/Debit, Currency, Interest Base Rate, Interest Rate Type, Period Frequency, RWA Type, Account Transaction Status, Account Transaction Type, Transaction Purpose) via foreign-key constraints (FK_ACCTRN__ACC, FK_ACC__CCY, FK_FXRX__CCY, ...).

Every reference table follows the same column pattern — Key (INTEGER, pk), Identifier, Description, Local Description, Source ID, Source System ID — and every table carries the standard audit columns: Delete Flag, Insert Datetime, Insert Process Identifier, Update Datetime, Update Effective Date, Update Process Identifier (plus Source Update DateTime on the ADS tables). Data types are Oracle: INTEGER, VARCHAR2(255 CHAR), DATE, NUMBER(19,3) for amounts and NUMBER(10,6) for rates.
<ak>
<pk,ak>
<<Ref Table>>
FX Rate Type
(<ABDM_DWH_REF_TAB_ADS>)
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
FX Rate Type Key
Identifier
Description
Local Description
Source ID
Source System ID
Delete Flag
Insert Datetime
Insert Process Identifier
Update Datetime
Update Effective Date
Update Process Identifier
INTEGER
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
INTEGER
DATE
VARCHAR2(255)
DATE
DATE
VARCHAR2(255)
<pk>
<ak>
<ak>
<<ADS Table>>
GL Account
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
GL Account Key
Account Status Key
Currency Key
GL Account Type Key
GL Account Number
GL Account Group
Description
Party Account Flag
Source Identifier
Source System Identifier
Delete Flag
Insert Process Identifier
Insert Datetime
Update Process Identifier
Update Datetime
Update Effective Date
Source Update DateTime
INTEGER
INTEGER
INTEGER
INTEGER
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
INTEGER
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
INTEGER
VARCHAR2(255)
DATE
VARCHAR2(255)
DATE
DATE
DATE
<pk>
<fk1>
<fk2>
<fk3>
<ak>
<pk,ak>
<<Ref Table>>
GL Account Type
(<ABDM_DWH_REF_TAB_ADS>)
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
GL Account Type Key
Identifier
Description
Local Description
Source ID
Source System ID
Delete Flag
Insert Datetime
Insert Process Identifier
Update Datetime
Update Effective Date
Update Process Identifier
INTEGER
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
INTEGER
DATE
VARCHAR2(255)
DATE
DATE
VARCHAR2(255)
<pk>
<ak>
<ak>
<<ADS Table>>
GL Account Transaction
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
GL Account Transaction Key
Transaction Date
GL Account Key
Cost Code Key
Cost Centre Key
Cost Project Key
Invoice Transaction ID
Invoice number
Invoice Document Identifier
Debit/Credit Key
GL Transaction Date
GL Transaction Amount
GL Transaction Amount Local Currency
Source Identifier
Source System Identifier
Delete Flag
Insert Process Identifier
Insert Datetime
Update Process Identifier
Update Datetime
Update Effective Date
Source Update DateTime
INTEGER
DATE
INTEGER
INTEGER
INTEGER
INTEGER
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
INTEGER
DATE
NUMBER(19,3)
NUMBER(19,3)
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
INTEGER
VARCHAR2(255)
DATE
VARCHAR2(255)
DATE
DATE
DATE
<pk>
<pk,ak>
<fk>
<ak>
<pk,ak,fk>
<<ADS Table>>
FX Rate : 2
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
FX Rate Key
Currency Key
FX Currency Key
FX Rate Type Key
FX Scale
Source Identifier
Source System Identifier
Delete Flag
Insert Process Identifier
Insert Datetime
Update Process Identifier
Update Datetime
Update Effective Date
Source Update DateTime
INTEGER
INTEGER
INTEGER
INTEGER
INTEGER
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
INTEGER
VARCHAR2(255)
DATE
VARCHAR2(255)
DATE
DATE
DATE
<pk>
<fk1>
<fk3>
<fk2>
<ak>
<pk,ak>
<<ADS Table>>
FX Rate Fact
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<DW Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
<<Audit Column>>
Snap Date
FX Rate Key
Rate Buy
Rate
Rate Sell
Value Date
Source Identifier
Source System Identifier
Delete Flag
Insert Process Identifier
Insert Datetime
Update Process Identifier
Update Datetime
Update Effective Date
Source Update DateTime
DATE
INTEGER
NUMBER(10,6)
NUMBER(10,6)
NUMBER(10,6)
DATE
VARCHAR2(255 CHAR)
VARCHAR2(255 CHAR)
INTEGER
VARCHAR2(255)
DATE
VARCHAR2(255)
DATE
DATE
DATE
<pk,ak>
<pk,fk>
<ak>
<pk,ak,fk>
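Every table in the model carries the same audit columns (Insert/Update Process Identifier, Insert/Update Datetime, Update Effective Date, Delete Flag). As a minimal sketch of how a load process might stamp them during an upsert — the in-memory `table` dict and the `upsert` helper are hypothetical stand-ins, only the column names come from the model:

```python
from datetime import datetime

# In-memory stand-in for an ADS table, keyed by its alternate key (<ak>).
table = {}

def upsert(ak, attrs, process_id):
    """Insert or update a row, stamping the audit columns from the model."""
    now = datetime.now()
    row = table.get(ak)
    if row is None:
        # Insert audit columns are written once and never overwritten.
        row = dict(attrs)
        row.update({
            "Insert Process Identifier": process_id,
            "Insert Datetime": now,
            "Delete Flag": 0,
        })
        table[ak] = row
    else:
        row.update(attrs)
    # Update audit columns are stamped on every load, insert or update.
    row.update({
        "Update Process Identifier": process_id,
        "Update Datetime": now,
        "Update Effective Date": now.date(),
    })
    return row

row = upsert(("CARD-1", "SRC1"), {"Card Name": "Gold"}, "ETL_RUN_42")
```

A later run with a new process identifier updates the row but preserves the original insert audit values, which is what makes the columns useful for lineage and troubleshooting.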
35. Records
Primary groups
Candidate groups
John Smith / null
John Smith / null
Jane Smith / 420347213
Jane Watson / 420347213
J Smith / 420347213
J Smith / null
Jane Watson / 420347213
John Smith / 095252433
John Smith / 095252433
John Smith / 095242434
John Smith / 095242434
Janette Smith / null
Secondary groups
?
Unique
Candidate groups (illustration)
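The candidate-group illustration above can be sketched as a first matching pass: records sharing an identifier form one candidate group, and records without an identifier fall back to an exact-name key. This is a deliberately naive sketch (real matching adds fuzzy name comparison and secondary-group refinement); the sample records are abridged from the slide:

```python
from collections import defaultdict

# (name, identifier) pairs, abridged from the slide; None means missing.
records = [
    ("John Smith", None), ("Jane Smith", "420347213"),
    ("Jane Watson", "420347213"), ("J Smith", None),
    ("John Smith", "095252433"), ("John Smith", "095242434"),
    ("Janette Smith", None),
]

def candidate_groups(records):
    """Group records by shared identifier; records without one
    fall back to a normalized-name key."""
    groups = defaultdict(list)
    for name, ident in records:
        key = ("id", ident) if ident else ("name", name.lower())
        groups[key].append((name, ident))
    return dict(groups)

groups = candidate_groups(records)
```

Note how "Jane Smith" and "Jane Watson" land in one candidate group via the shared identifier, while the three "John Smith" records stay apart — exactly the kind of ambiguity the secondary-group step then has to resolve.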
36. A large solution => comprehensive governance is a must
Concepts
Vision & Mission
Guiding Principles
Organization & Roles
Business Rules
Activities
Scope
Benefits & Goals
Components
Data Architecture
Data Quality
Data Integration
Operations
Security
RDM & MDM
Metadata
Data Platform & BI
Tools
CASE
Enterprise Metadata Repository
Data Quality Tools
QA Framework
Workflow & Orchestration
IDE
Audit Log
Resource Management
RDBMS
NoSQL
Hadoop
Integration tools
Monitoring
Source Code Repository
Testing Tools
Others
Why What How
38. Summary and recommendations for adopting DataOps
DataOps is an absolute necessity, because 2 out of 3 analytics projects still fail
(Data Kitchen)
DataOps is a data management framework focused on improving and speeding up
communication, data integration and the automation of data flows
Real-world DataOps adoptions confirm a meaningful benefit in more than 80% of cases
(451 Research study)
Without DataOps there can be no effective data-driven culture
DataOps cannot be bought (even though "DataOps tools" exist); it must be built
as an integral part of the organization (possibly with the help of those "DataOps tools")
DataOps puts strong emphasis on continuous delivery of value through data
analytics (Value Pipeline) and on its rapid ongoing innovation (Innovation Pipeline),
sandboxing and self-service
DevOps Agile
Data
Management
Lean
Manufacturing
DataOps
Innovation Pipeline
Value
Pipeline
Value
Prototyping
Verification
Standardization
Analytics
Domain
Quality
Data
Data and logic tests
Version Control System
Branch & Merge
Multiple environments
Parameterized processing
Work without fear and heroics
Data architecture / DataOps metrics / Communication
39.
Speeding up and improving data warehouses with automated tools and
processes
Focusing more on the data itself instead of routine data-related chores
Development automation
Higher developer productivity => faster delivery
Consistent procedures and standards => more maintainable solutions
Automation makes agile approaches easier to apply
A standardized testing process ensures continuous Quality Assurance
Easier development and prototyping allow faster reaction to change
Easy impact analysis of data warehouse changes thanks to metadata
Basic types
Model Driven
Data Driven
Operations automation
The deployment process is simplified, package-based, and limits manual work
Documentation is generated automatically and stays consistent with the current release
Easy impact analysis of operational changes on the data warehouse and end users
Enterprise extensions ensure a longer solution lifetime
More robust standardized processes ensure more stable, higher-quality operations
Better security thanks to Quality Assurance, standards and procedures
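The "Model Driven" flavor of development automation can be illustrated by generating DDL from table metadata, the way a DWA tool might. This is a toy sketch; the metadata shapes and the automatic audit-column convention mirror the slides' concepts, not any specific tool:

```python
# Audit columns appended to every generated table by convention,
# as a metadata-driven DWA tool would enforce.
AUDIT_COLUMNS = [
    ("DELETE_FLAG", "INTEGER"),
    ("INSERT_PROCESS_ID", "VARCHAR2(255)"),
    ("INSERT_DATETIME", "DATE"),
    ("UPDATE_PROCESS_ID", "VARCHAR2(255)"),
    ("UPDATE_DATETIME", "DATE"),
]

def generate_ddl(table_name, columns):
    """Render a CREATE TABLE statement from column metadata (model driven)."""
    cols = columns + AUDIT_COLUMNS
    body = ",\n  ".join(f"{name} {dtype}" for name, dtype in cols)
    return f"CREATE TABLE {table_name} (\n  {body}\n);"

ddl = generate_ddl("FX_RATE_TYPE", [
    ("FX_RATE_TYPE_KEY", "INTEGER"),
    ("IDENTIFIER", "VARCHAR2(255 CHAR)"),
    ("DESCRIPTION", "VARCHAR2(255 CHAR)"),
])
```

The point of the pattern is consistency: because every table is rendered from the same template, standards (naming, audit columns, datatypes) hold across the whole warehouse, and impact analysis reduces to querying the metadata.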
Data Warehousing Automation (DWA) tools:
Adastra
Ajilius
AnalytiX DS
Attunity Compose (Biready)
BI Builder
BI builders
biGenius
Birst
Centennium Automation Tool
Datavault Builder
DDM Studio
Dimodelo
Effektor
Gamma Systems
Halo BI
Insource Data Academy
Instant Business Intelligence
(SeETL)
Kalido
LeapFrogBI
Optimal ODE
Quipu
TimeXtender
Varigence
WhereScape
45. Current challenges of Data Management
Sources:
451 Research, DataOps: the foundation for agility, security and transformational change, March 2019
Data Kitchen, Washington DC DataOps Meetup, 2019
87% of data science projects
never get to production.
Data analytics investment
up, but “data driven”
organizations down 37% to
31%
60% of all data analytic
projects fail
79% of data projects have
too many errors
47. 2005 & 2019 Side by Side
Business Intelligence
Data Sources
ERP CRM External Systems Internal Systems
Analytics
Reporting OLAP Data Mining
Data Integration
ETL EAI
Industry
Know-How
Database
Data Warehouse Data Mart Operational Data Store Staging Area
End User Access
Intranet EIS & Monitoring Analytics Tools Others
Management
Technical
Expertise
Data Quality
Metadata
Analytics
Department
Customer
Care
Others
Enrichment & Consolidation & Event Processing
MDM DQ Reference Data Management Complex Event Processing Message Requeueing DMP
Data Acquisition & Data Ingest
Speed Processing Batch Processing Change Data Capture Direct Data Extractor Bulk Copy
Publisher/Subscriber
Data Sources
Relational Data Semi-Structured Data Unstructured Data Streams Events Signals User Files
Analytics
Statistics OLAP Advanced Analytics Artificial Intelligence Machine Learning Stream Analytics Geospatial
Analytics
Data Integration
ETL ELT Big Data ELT Data API Microservices Self-service ETL Real-time Integration
Governance
Data Model
Data Strategy
Data Delivery
Architecture
Methodology
Standards
Metadata
Management
Data
Catalogue
Data Lineage
Business
Glossary
Documentation
Information
Lifecycle
Testing
Strategy
BICC
Data Store
Data Warehouse Data Mart Data Lake ODS NoSQL Sandbox Event Hub Big Data Platform In-memory Columnar
Data Access
Data Connector Query Engine Data API Web GUI Application Integration Mobile Applications Indexing & Search
Business Intelligence
Reports Ad-hoc Query Dashboard Data Visualization Data Discovery Self-Service BI Mobile BI Data Science GUI
Business Users & Applications Development
&
Operations
Monitoring
Alerts &
Notification
Scheduling
Workflow
Security
Resource
Management
Release
Management
High Availability
Backup &
Restore
Data Purge
Automation
Metadata
Driven
Development
48. Truth in data
Primary data
Primary data
(another system)
Secondary data
Consolidated data
…Noise generator
Truth
Independent truth in data does not exist
Truth depends on how the Business and Data Architects define it
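That architect-defined "truth" is typically implemented as a survivorship rule during consolidation: for each attribute, the value from the most trusted source wins. A minimal sketch under that assumption — the source names and priorities are invented for illustration:

```python
# Lower number = more trusted source; this ranking IS the architects'
# definition of truth, and is invented here for illustration.
SOURCE_PRIORITY = {"CRM": 1, "ERP": 2, "LEGACY": 3}

def consolidate(records):
    """Merge attribute values from several sources; for each attribute,
    the non-null value from the highest-priority source survives."""
    golden = {}
    best = {}  # per-attribute priority of the value currently held
    for source, attrs in records:
        prio = SOURCE_PRIORITY[source]
        for key, value in attrs.items():
            if value is not None and prio < best.get(key, float("inf")):
                golden[key] = value
                best[key] = prio
    return golden

golden = consolidate([
    ("LEGACY", {"name": "J. Smith", "phone": "123"}),
    ("CRM", {"name": "John Smith", "phone": None}),
])
```

Note that the consolidated record is not "independently true": change the priority table and the golden record changes with it, which is exactly the slide's point.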
49. Other Topics
– DW vs. Business Intelligence
– DW vs. Operational Data Store
– DW vs. Master Data Management
– DW vs. Big Data
– Metadata
– Data Lineage
– Data Governance
– Implementation
– Data Modelling
– Mapping
– Parallel Processing
– Metadata Driven Development
– Information Delivery
– KPIs, Metrics, Dimensions
– Data Analytics
– Semantic Data Layer
– Self-service BI
– Data Virtualization / Data Federation
– Operations
– Automation
– Workflow
– Disaster Recovery
– Technologies
50. Data Warehousing & Business Intelligence
Data Platform
A Data Warehouse, a Data Lake, a Big Data Platform, or anything
else for storing and managing data for analytics.
Data Integration
Processes combining and transforming data from different
sources and providing consolidated structures of data in
motion and data at rest.
Data Analytics
Processes of inspecting, transforming, modeling data in
motion and data at rest. Data Science is included.
Data Governance
A framework to ensure the appropriate behavior in the
valuation, creation, integration, storing, consumption and
control of data and analytics.
DataOps
An automated methodology to improve the quality and
reduce the cycle time of data analytics based on Agile,
DevOps and Lean Manufacturing.
Reporting & Business Intelligence
Presenting data to end-users in a way that is
understandable and actionable.
Technical Solutions Business Solutions
General
Augmented Analytics
Data Discovery
Data Storytelling
Data for Planning
External Data Enrichment
Self-service BI
Finance
Budgeting & Planning
Business Performance Reporting
Profitability Analytics
Risk Management
Fraud Detection
Loan Classification
Portfolio Reporting
Risk Based Pricing
Risk Modeling
CRM & Marketing
Campaign Monitoring
Churn Prevention
Customer Lifetime Value
Customer Segmentation
Geolocation Analytics
Know Your Customer
Network Analytics
Omnichannel Communication
Sentiment Analytics
Sales
Product Propensity
Sales Network Performance
Up-sell & X-sell
Others
HR Attrition
Predictive Maintenance
Quality Assurance