Data Warehouse Components
#11
23. 1. 2017
Prague Data Management Meetup
Agenda
• Prague Data Management Meetup
• Data Warehouse Components
Prague Data Management Meetup
Data Management
Data acquisition
Data storage
Data processing
Data interpretation
Data usage
• Open professional interest group
• Everyone is welcome (in a passive or an active role)
• There are never enough topics
• We aim to meet regularly every month
• Running since September 2015
History
Date            Topic
10. 9. 2015     Data Management
14. 10. 2015    Data Lake
23. 11. 2015    Dark Data (without Dark Energy and Dark Force)
12. 1. 2016     Data Lake (again)
7. 3. 2016      Sad Stories About DW Modeling (sad stories only)
23. 3. 2016     Self-service BI Street Battle
27. 4. 2016     Let's explore the new Microsoft PowerBI!
22. 9. 2016     Data Management for Beginners
17. 10. 2016    Small Big Data
22. 11. 2016    DW Modeling Basics
23. 1. 2017     Data Warehouse Components
Data Management
Big Data
Data Warehouse
• Data integration from various data
sources in requested quality and
time
• Publish and share consistent
information for purposes and users
• Flexible and effective ad-hoc
reporting and analysis
• Features
• Subject Orientation
• Data Integration
• Low variability
• History
• Main perspectives
• Data Integration
• Data Storage
• Data Access
• New opportunities
• Complex Event Processing in real time
• Application Integration
• Real-time decision support
• Operational Data Store
• Integration with Big Data Platform
EDW, DW, DSS, ADS, ADW, DP…
Operational Database vs. Data Warehouse
Characteristic        Operational Database   Data Warehouse
Currency              Current                Historical
Detail level          Individual             Individual and summary
Orientation           Process                Subject
Records per request   Few                    Thousands
Normalization level   Mostly normalized      Normalization relaxed
Update level          Highly volatile        Mostly refreshed (non-volatile)
Data model            Relational             Relational (star schemas) and multidimensional (data cubes)
Source: Coursera
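The "Records per request" and "Data model" rows can be illustrated with a minimal sketch. It assumes an in-memory SQLite database and a hypothetical two-table star schema (`fact_sales`, `dim_product`); none of these names come from the slides. The first function reads a single record by key, as an operational system typically would, and the second summarizes history by subject, as a warehouse query typically would.

```python
import sqlite3

# Hypothetical star schema: one fact table and one dimension table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_sales (
        sale_id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES dim_product(product_id),
        sale_year INTEGER,
        amount REAL
    );
    INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Games');
    INSERT INTO fact_sales VALUES
        (1, 1, 2015, 10.0), (2, 1, 2016, 20.0),
        (3, 2, 2016, 5.0),  (4, 2, 2016, 7.5);
""")


def operational_lookup(sale_id):
    """Operational style: fetch one individual record by its key."""
    return conn.execute(
        "SELECT sale_id, amount FROM fact_sales WHERE sale_id = ?", (sale_id,)
    ).fetchone()


def warehouse_summary():
    """Warehouse style: scan the history and summarize it by subject."""
    return conn.execute("""
        SELECT p.category, f.sale_year, SUM(f.amount)
        FROM fact_sales f JOIN dim_product p USING (product_id)
        GROUP BY p.category, f.sale_year
        ORDER BY p.category, f.sale_year
    """).fetchall()
```

Calling `operational_lookup(3)` touches one row, while `warehouse_summary()` reads every fact row and returns one line per category and year.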
Adastra Business Intelligence Reference Architecture
ODS
Operational reporting
Enterprise DWH
Big Data Platform
Data Lake
Event Processing
Semantic Models
Advanced Analytics
Perceptual / cognitive intelligence
Information Management
Relational / Structured data
Unstructured data
Streaming
Data Workflow Orchestration
Data Transformation / Processing
Data Management
Event Ingestion
Complex Event Processing
Notifications
BI / Application Integration
Machine Learning
In-database Data Mining, R
Recognition of human interaction and intent
SMP and MPP
In-memory technologies
In-memory Columnar
In-memory technologies
Hadoop, NoSQL
Business Intelligence / Data Delivery
Real-time Dashboards
Dashboards and visualizations
Reports
Self-service BI
Mobile BI
IoT Network
Field Gateway
Big data
OLAP
DW Logical Layers
L0: Stage Area
L1: Relational Area
L1: Consolidation Area
L2: Data Mart Area
• Data Mart Area
• L2
• User Access Layer
• Consolidation Area
• Consolidated L1
• Common aggregates for L2
• Cleansed and consolidated data
• Relational Area
• Detailed L1
• Consistent, integrated, subject-oriented data, universal data structure, historical data, maximum detail
• System of record
• Foundation Layer
• Stage Area
• Direct copy of source systems
Extracts
Reports
Note: Consolidated and Detailed L1 can share the same data structures
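The layering above can be sketched as a tiny pipeline. This is only an illustration under stated assumptions: an in-memory SQLite database, hypothetical table names (`l0_stage_orders`, `l1_orders`, `l2_customer_revenue`), and a full reload on every run, so the history keeping that L1 normally provides is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- L0: Stage Area, a direct copy of the source extract (untyped text).
    CREATE TABLE l0_stage_orders (order_id TEXT, customer TEXT, amount TEXT);
    -- L1: Relational Area, integrated and typed data at maximum detail.
    CREATE TABLE l1_orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
    -- L2: Data Mart Area, the user access layer holding aggregates.
    CREATE TABLE l2_customer_revenue (customer TEXT PRIMARY KEY, revenue REAL);
""")


def load_layers(extract_rows):
    """Push one source extract through L0, L1 and L2 (full reload for brevity)."""
    with conn:
        conn.execute("DELETE FROM l0_stage_orders")
        conn.executemany("INSERT INTO l0_stage_orders VALUES (?, ?, ?)", extract_rows)
        # L0 -> L1: cast types and normalize customer names.
        conn.execute("DELETE FROM l1_orders")
        conn.execute("""
            INSERT INTO l1_orders
            SELECT CAST(order_id AS INTEGER), UPPER(TRIM(customer)), CAST(amount AS REAL)
            FROM l0_stage_orders
        """)
        # L1 -> L2: aggregate the detailed data for reporting.
        conn.execute("DELETE FROM l2_customer_revenue")
        conn.execute("""
            INSERT INTO l2_customer_revenue
            SELECT customer, SUM(amount) FROM l1_orders GROUP BY customer
        """)


load_layers([("1", " acme ", "10.5"), ("2", "ACME", "4.5"), ("3", "Globex", "7")])
```

The stage table stays as close to the source as possible, so cleansing rules live only in the L0 to L1 step and reports read only from L2.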
General DWH
Diagram components: Source systems (S1, S2, S3, S4, Other), Customer DB, CDB, ETL, EAI, Staging Area, ODS, Relational Area, Materialization, Datamart Area (Dependent Datamarts), OLAP, Presentation Layer, Reporting, Analytic tools (SPSS, SAS..), Application
Data Warehouse Components
• Data Stores
• Access Tools
• Metadata
• Data Integration Tools
• Administration and Management
• Development Tools
Not only technology!
Data Stores
Logical Stores
• Data Warehouse
• Data Mart
• Operational Data Store
• Customer Data Integration
• Product Data Integration
• Data Hub
• Data Lake
• Data Archive
• Big Data Platform
Physical Stores
• RDBMS
• OLAP
• HDFS
• NoSQL
• SMP
• MPP
• Cluster
• Appliance
Data Integration Tools
Custom Scripts
ELT vs. ETL
Real-time
Change Data Capture
Logical Mapping
Physical Mapping
Workflow
Dependencies
Restartability
Error Handling
Scheduling
Events
Monitoring
QA
Testing
Design Patterns
Design Standards
Kappa Architecture
Lambda Architecture
Data Quality
Matching
Data Cleanup
Data Profiling
Data Checks
Data Sources and Targets (ODI Examples)
• Apache Derby
• Apache HDFS
• Apache Hive
• Apache HBase
• Cloudera CDH
• dBase
• HyperSQL Database Engine
• IBM DB2
• IBM DB2 for Linux, Unix and Windows
• IBM DB2 for i
• IBM DB2 for z/OS
• IBM Informix
• IBM Informix Dynamic Server (DS)
• IBM Informix Extended Parallel Server (XPS)
• IBM Netezza NPS
• IBM WebSphere MQ
• Ingres
• InterBase
• ISO Database Language SQL (generic SQL-92 database)
• Java Message Service (JMS)
• Microsoft Access
• Microsoft Excel
• Microsoft SQL Server
• MySQL Server
• Oracle Database
• Oracle Essbase
• Oracle Hyperion Planning
• Oracle Service Bus
• Oracle TimesTen In-Memory Database
• Paradox
• Pervasive PSQL
• PostgreSQL
• SAP BW
• SAP ERP ECC
• Teradata Database
• Text files
• XML files
• JSON files
Batch Data Transformation: ETL vs. ELT
ELT: Extract, Load, Transform
ETL: Extract, Transform, Load
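The difference between the two orderings can be shown with a small sketch. It assumes an in-memory SQLite database standing in for the target platform and a hypothetical three-row extract; the table names are made up for illustration. In `etl` the cleansing runs in the integration layer before anything is loaded, while in `elt` the raw rows are loaded first and the target's own SQL engine does the transformation.

```python
import sqlite3

RAW_ROWS = [("alice", "10"), ("bob", "n/a"), ("carol", "25")]  # hypothetical extract


def etl(rows):
    """ETL: transform in the integration layer, then load only clean rows."""
    target = sqlite3.connect(":memory:")
    target.execute("CREATE TABLE customer_score (name TEXT, score INTEGER)")
    clean = [(name.title(), int(score)) for name, score in rows if score.isdigit()]
    target.executemany("INSERT INTO customer_score VALUES (?, ?)", clean)
    return target.execute(
        "SELECT name, score FROM customer_score ORDER BY name"
    ).fetchall()


def elt(rows):
    """ELT: load raw rows first, then transform with SQL inside the target."""
    target = sqlite3.connect(":memory:")
    target.execute("CREATE TABLE raw_customer_score (name TEXT, score TEXT)")
    target.executemany("INSERT INTO raw_customer_score VALUES (?, ?)", rows)
    target.execute("""
        CREATE TABLE customer_score AS
        SELECT UPPER(SUBSTR(name, 1, 1)) || SUBSTR(name, 2) AS name,
               CAST(score AS INTEGER) AS score
        FROM raw_customer_score
        WHERE score <> '' AND score NOT GLOB '*[^0-9]*'
    """)
    return target.execute(
        "SELECT name, score FROM customer_score ORDER BY name"
    ).fetchall()
```

Both functions end with the same clean table. What differs is where the work happens and whether the raw data is kept in the target.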
Data Transformation Model Pattern
Source table
Target table
table
Target table
Filter SRC
Filter TRG
Differential member (minus, outer join, or merge)
Filter OUT
Lookup tables
Join SRC
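One possible reading of this pattern, sketched with an in-memory SQLite database: the source is filtered, joined to a lookup table, and compared with the target so that only new or changed rows are written. The table names (`src_customer`, `lkp_country`, `trg_customer`) and sample rows are hypothetical, the differential member is implemented here with `EXCEPT` (the "minus" variant), and the Filter TRG and Filter OUT steps are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customer (id INTEGER, name TEXT, country_code TEXT, active INTEGER);
    CREATE TABLE lkp_country (country_code TEXT PRIMARY KEY, country_name TEXT);
    CREATE TABLE trg_customer (id INTEGER PRIMARY KEY, name TEXT, country_name TEXT);
    INSERT INTO src_customer VALUES
        (1, 'Ann', 'CZ', 1), (2, 'Bob', 'SK', 1), (3, 'Eve', 'CZ', 0);
    INSERT INTO lkp_country VALUES ('CZ', 'Czechia'), ('SK', 'Slovakia');
    INSERT INTO trg_customer VALUES (1, 'Ann', 'Czechia'), (2, 'Bob', 'Czechia');
""")


def differential_load():
    """Find rows that are new or changed, upsert them, and return them."""
    delta = conn.execute("""
        SELECT s.id, s.name, l.country_name
        FROM src_customer s
        JOIN lkp_country l ON l.country_code = s.country_code  -- Join SRC with lookup
        WHERE s.active = 1                                     -- Filter SRC
        EXCEPT                                                 -- Differential member
        SELECT id, name, country_name FROM trg_customer
        ORDER BY 1
    """).fetchall()
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO trg_customer (id, name, country_name) "
            "VALUES (?, ?, ?)",
            delta,
        )
    return delta
```

A second call right after the first returns an empty list, because the target already matches the transformed source. This makes the load safe to rerun.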
Lambda Architecture
Kappa Architecture
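As a rough sketch of how the two architectures differ, the toy example below counts page events. It is an assumption-laden simplification: the event lists and function names are hypothetical, and real systems would use a batch engine and a stream processor rather than in-process counters. Lambda answers a query by merging a batch view with a speed view, while Kappa keeps a single streaming path and reprocesses by replaying the log.

```python
from collections import Counter

EVENTS = ["page_a", "page_b", "page_a"]  # already covered by the last batch run
RECENT = ["page_a", "page_c"]            # arrived since the last batch run


def lambda_query(batch_events, recent_events):
    """Lambda: merge a precomputed batch view with a real-time speed view."""
    batch_view = Counter(batch_events)   # recomputed periodically from all data
    speed_view = Counter(recent_events)  # updated incrementally for fresh events
    return batch_view + speed_view


def kappa_query(event_log):
    """Kappa: one stream processor; reprocessing means replaying the whole log."""
    view = Counter()
    for event in event_log:
        view[event] += 1
    return view
```

Both approaches give the same counts here. The trade-off is that Lambda maintains two code paths, and Kappa depends on a replayable log.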
Oracle Data Integrator
Pentaho Data Integration
Metadata
Technical metadata
Business metadata
Static metadata
Dynamic metadata
Context metadata
Data model
Transformation model
Methodology
Design Standards
• Naming conventions
• Models, nodes, layers, schemas
• Entities
• Attributes
• Keys
• Relationships
• Indexes, constraints
• Level of normalization
• Level of convergence
• Key strategy
• Standard attributes
Supplier standards
Customer standards
Third-party product standards
Typical Metadata Backbone
• Primary generic principles of the data warehouse solution (goals and objectives)
• Logical architecture document
• Physical architecture document
• Environment map
• Logical data model name and design standard
• Data history and retention strategy
• Physical data model design standard including partitioning
• List of acronyms
• Business rule document
• KPI catalogue
• Transformation design standard
• Process definition for analysis, design, development, testing, release management and bug fixing
• Templates for analysis, design, development, testing, release management and bug fixing
• Test catalog
• Developer guideline
• Operation guideline including DR strategy
• Quality assurance
• SLA templates
• Information delivery strategy
• Data Mart strategy
• Data Quality strategy
• Business entity mapping to data model with examples for end users
• Business data domain document
• Security architecture
Access Tools
• Microsoft Excel rulez
• SQL Query Tools
• Enterprise Reporting Tools
• Self-service BI
• Data Discovery
• Data Mining Tools (Weka, Azure ML…)
• Statistics Tools (SPSS, SAS, R…)
• Information Delivery
• Real-time decision
• Application integration
Oracle Business Intelligence
• On-premise and cloud variants
• Support for advanced analytics, self-service visualization and Mobile BI
Oracle Big Data Discovery
• Native self-service analytics for Big Data solutions
Tableau Desktop
Qlik Sense Desktop
Microsoft Power BI Desktop
SAS Visual Analytics
Administration and Management
• Daily operations
• Environments
• Data quality checks
• Managing and updating metadata
• Auditing and reporting data warehouse usage and status
• Purging data
• Replicating, subsetting and distributing data
• Backup and recovery
• Data warehouse storage management
• Bug tracking and fixing
DW Stacks
Stack Others
RDBMS
Oracle Database
MySQL
Microsoft SQL Server
Microsoft SQL Server APS
Azure SQL Data Warehouse
Amazon Redshift
HP Vertica
IBM dashDB
IBM DB2
PostgreSQL
SAP HANA
SAP IQ
SAP SQL Anywhere
Teradata Database
ETL/ELT
Oracle Data Integrator
Oracle Golden Gate
MS Integration Services
Azure Data Factory
Clover ETL
IBM InfoSphere DataStage
Informatica PowerCenter
Pentaho Data Integration
SAP Data Services
SAS Data Integration
Talend Data Integration
BI & Analytics
Oracle Big Data Discovery
Oracle Business Intelligence
Oracle Endeca Data Discovery
Oracle Essbase
Oracle R Enterprise
Azure Machine Learning
MS Analysis Services
MS Datazen
MS Excel BI
MS Power BI
MS Reporting Services
Revolution R
Amazon QuickSight
GoodData
IBM Cognos Reporting
IBM Watson Analytics
Microstrategy Analytics
Qlik Sense
Qlikview
SAP Business Objects
SAS Visual Analytics
Tableau
Teradata Aster Discovery Platform
Appliances
Oracle Exadata
Oracle SuperCluster
MS Analytic Platform System
IBM Netezza Twinfin
SAP HANA
Teradata Data Warehouse Appliance
HP Vertica Analytics System
Microsoft Stack
Excel + Power BI add-ins
Query, Pivot, View, Map
SharePoint
Power Pivot Gallery, Power View
Excel
Data Mining
Power BI Desktop Power BI Portal
Azure ML
End-to-End DW & Big Data Platform, Driving Analytics on any Data
Power BI Mobile App
Analytics Platform System (APS)
Oracle Stack
Data Warehouse Components
• Data Stores
• Access Tools
• Metadata
• Data Integration Tools
• Administration and Management
• Development Tools
Which ones are missing?
People? BICC?

More Related Content

What's hot

Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Mark Rittman
 
Pentaho Big Data Analytics with Vertica and Hadoop
Pentaho Big Data Analytics with Vertica and HadoopPentaho Big Data Analytics with Vertica and Hadoop
Pentaho Big Data Analytics with Vertica and Hadoop
Mark Kromer
 
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
Dipti Borkar
 
Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...
Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...
Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...
Roland Bouman
 
Pitfalls of Data Warehousing_2019-04-24
Pitfalls of Data Warehousing_2019-04-24Pitfalls of Data Warehousing_2019-04-24
Pitfalls of Data Warehousing_2019-04-24
Martin Bém
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake Overview
James Serra
 
Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)
James Serra
 
Myth Busters II: BI Tools and Data Virtualization are Interchangeable
Myth Busters II: BI Tools and Data Virtualization are InterchangeableMyth Busters II: BI Tools and Data Virtualization are Interchangeable
Myth Busters II: BI Tools and Data Virtualization are Interchangeable
Denodo
 
Big Data Analytics from Azure Cloud to Power BI Mobile
Big Data Analytics from Azure Cloud to Power BI MobileBig Data Analytics from Azure Cloud to Power BI Mobile
Big Data Analytics from Azure Cloud to Power BI Mobile
Roy Kim
 
Technological insights behind Clusterpoint database
Technological insights behind Clusterpoint databaseTechnological insights behind Clusterpoint database
Technological insights behind Clusterpoint database
Clusterpoint
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's included
James Serra
 
Pentaho Analytics on MongoDB
Pentaho Analytics on MongoDBPentaho Analytics on MongoDB
Pentaho Analytics on MongoDB
Mark Kromer
 
Big Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft AzureBig Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft Azure
Mark Kromer
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
James Serra
 
Introduction to PolyBase
Introduction to PolyBaseIntroduction to PolyBase
Introduction to PolyBase
James Serra
 
Modernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSModernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APS
Stéphane Fréchette
 
Real-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven ApplicationsReal-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven Applications
VMware Tanzu
 
Data Vault Vs Data Lake
Data Vault Vs Data LakeData Vault Vs Data Lake
Data Vault Vs Data Lake
Calum Miller
 
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
Rittman Analytics
 

What's hot (20)

Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
 
Pentaho Big Data Analytics with Vertica and Hadoop
Pentaho Big Data Analytics with Vertica and HadoopPentaho Big Data Analytics with Vertica and Hadoop
Pentaho Big Data Analytics with Vertica and Hadoop
 
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
 
Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...
Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...
Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...
 
Pitfalls of Data Warehousing_2019-04-24
Pitfalls of Data Warehousing_2019-04-24Pitfalls of Data Warehousing_2019-04-24
Pitfalls of Data Warehousing_2019-04-24
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake Overview
 
Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)
 
Myth Busters II: BI Tools and Data Virtualization are Interchangeable
Myth Busters II: BI Tools and Data Virtualization are InterchangeableMyth Busters II: BI Tools and Data Virtualization are Interchangeable
Myth Busters II: BI Tools and Data Virtualization are Interchangeable
 
Big Data Analytics from Azure Cloud to Power BI Mobile
Big Data Analytics from Azure Cloud to Power BI MobileBig Data Analytics from Azure Cloud to Power BI Mobile
Big Data Analytics from Azure Cloud to Power BI Mobile
 
Technological insights behind Clusterpoint database
Technological insights behind Clusterpoint databaseTechnological insights behind Clusterpoint database
Technological insights behind Clusterpoint database
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's included
 
Pentaho Analytics on MongoDB
Pentaho Analytics on MongoDBPentaho Analytics on MongoDB
Pentaho Analytics on MongoDB
 
Big Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft AzureBig Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft Azure
 
Varadarajan CV
Varadarajan CVVaradarajan CV
Varadarajan CV
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
 
Introduction to PolyBase
Introduction to PolyBaseIntroduction to PolyBase
Introduction to PolyBase
 
Modernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSModernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APS
 
Real-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven ApplicationsReal-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven Applications
 
Data Vault Vs Data Lake
Data Vault Vs Data LakeData Vault Vs Data Lake
Data Vault Vs Data Lake
 
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
 

Similar to Prague data management meetup 2017-01-23

How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?
James Serra
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
James Serra
 
Trivadis Azure Data Lake
Trivadis Azure Data LakeTrivadis Azure Data Lake
Trivadis Azure Data Lake
Trivadis
 
20160317 - PAZUR - PowerBI & R
20160317  - PAZUR - PowerBI & R20160317  - PAZUR - PowerBI & R
20160317 - PAZUR - PowerBI & R
Łukasz Grala
 
Big Data with Not Only SQL
Big Data with Not Only SQLBig Data with Not Only SQL
Big Data with Not Only SQL
Philippe Julio
 
IaaS, PaaS, and DevOps for Data Scientist
IaaS, PaaS, and DevOps for Data ScientistIaaS, PaaS, and DevOps for Data Scientist
IaaS, PaaS, and DevOps for Data Scientist
Dmitry Petukhov
 
Introduction to SQL Server Analysis services 2008
Introduction to SQL Server Analysis services 2008Introduction to SQL Server Analysis services 2008
Introduction to SQL Server Analysis services 2008
Tobias Koprowski
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
James Serra
 
Best practices to deliver data analytics to the business with power bi
Best practices to deliver data analytics to the business with power biBest practices to deliver data analytics to the business with power bi
Best practices to deliver data analytics to the business with power bi
Satya Shyam K Jayanty
 
Choosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloudChoosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloud
James Serra
 
MDS ap_OEM Product Portfolio Intorduction to the DT & Analytics
MDS ap_OEM Product Portfolio Intorduction to the DT & AnalyticsMDS ap_OEM Product Portfolio Intorduction to the DT & Analytics
MDS ap_OEM Product Portfolio Intorduction to the DT & Analytics
MDS ap
 
Getting Started with Data Virtualization – What problems DV solves
Getting Started with Data Virtualization – What problems DV solvesGetting Started with Data Virtualization – What problems DV solves
Getting Started with Data Virtualization – What problems DV solves
Denodo
 
DA_01_Intro.pptx
DA_01_Intro.pptxDA_01_Intro.pptx
DA_01_Intro.pptx
Alok Mohapatra
 
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenariosThe Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
kcmallu
 
Transform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big DataTransform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big Data
Ashnikbiz
 
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Amazon Web Services
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
Rakesh Jayaram
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
Elena Lopez
 
Using Data Lakes
Using Data LakesUsing Data Lakes
Using Data Lakes
Amazon Web Services
 
What’s new in SQL Server 2017
What’s new in SQL Server 2017What’s new in SQL Server 2017
What’s new in SQL Server 2017
James Serra
 

Similar to Prague data management meetup 2017-01-23 (20)

How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
 
Trivadis Azure Data Lake
Trivadis Azure Data LakeTrivadis Azure Data Lake
Trivadis Azure Data Lake
 
20160317 - PAZUR - PowerBI & R
20160317  - PAZUR - PowerBI & R20160317  - PAZUR - PowerBI & R
20160317 - PAZUR - PowerBI & R
 
Big Data with Not Only SQL
Big Data with Not Only SQLBig Data with Not Only SQL
Big Data with Not Only SQL
 
IaaS, PaaS, and DevOps for Data Scientist
IaaS, PaaS, and DevOps for Data ScientistIaaS, PaaS, and DevOps for Data Scientist
IaaS, PaaS, and DevOps for Data Scientist
 
Introduction to SQL Server Analysis services 2008
Introduction to SQL Server Analysis services 2008Introduction to SQL Server Analysis services 2008
Introduction to SQL Server Analysis services 2008
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
 
Best practices to deliver data analytics to the business with power bi
Best practices to deliver data analytics to the business with power biBest practices to deliver data analytics to the business with power bi
Best practices to deliver data analytics to the business with power bi
 
Choosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloudChoosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloud
 
MDS ap_OEM Product Portfolio Intorduction to the DT & Analytics
MDS ap_OEM Product Portfolio Intorduction to the DT & AnalyticsMDS ap_OEM Product Portfolio Intorduction to the DT & Analytics
MDS ap_OEM Product Portfolio Intorduction to the DT & Analytics
 
Getting Started with Data Virtualization – What problems DV solves
Getting Started with Data Virtualization – What problems DV solvesGetting Started with Data Virtualization – What problems DV solves
Getting Started with Data Virtualization – What problems DV solves
 
DA_01_Intro.pptx
DA_01_Intro.pptxDA_01_Intro.pptx
DA_01_Intro.pptx
 
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenariosThe Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
 
Transform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big DataTransform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big Data
 
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
 
Using Data Lakes
Using Data LakesUsing Data Lakes
Using Data Lakes
 
What’s new in SQL Server 2017
What’s new in SQL Server 2017What’s new in SQL Server 2017
What’s new in SQL Server 2017
 

More from Martin Bém

Prague data management meetup #30 2019-10-04
Prague data management meetup #30 2019-10-04Prague data management meetup #30 2019-10-04
Prague data management meetup #30 2019-10-04
Martin Bém
 
Prague data management meetup #31 2020-01-27
Prague data management meetup #31 2020-01-27Prague data management meetup #31 2020-01-27
Prague data management meetup #31 2020-01-27
Martin Bém
 
Meetup 2018-10-23
Meetup 2018-10-23Meetup 2018-10-23
Meetup 2018-10-23
Martin Bém
 
Prague data management meetup 2018-04-17
Prague data management meetup 2018-04-17Prague data management meetup 2018-04-17
Prague data management meetup 2018-04-17
Martin Bém
 
Prague data management meetup 2018-05-22
Prague data management meetup 2018-05-22Prague data management meetup 2018-05-22
Prague data management meetup 2018-05-22
Martin Bém
 
Prague data management meetup 2018-02-27
Prague data management meetup 2018-02-27Prague data management meetup 2018-02-27
Prague data management meetup 2018-02-27
Martin Bém
 
Prague data management meetup 2018-01-30
Prague data management meetup 2018-01-30Prague data management meetup 2018-01-30
Prague data management meetup 2018-01-30
Martin Bém
 
Prague data management meetup 2017-11-21
Prague data management meetup 2017-11-21Prague data management meetup 2017-11-21
Prague data management meetup 2017-11-21
Martin Bém
 
Prague data management meetup 2017-10-24
Prague data management meetup 2017-10-24Prague data management meetup 2017-10-24
Prague data management meetup 2017-10-24
Martin Bém
 
Prague data management meetup 2017-09-26
Prague data management meetup 2017-09-26Prague data management meetup 2017-09-26
Prague data management meetup 2017-09-26
Martin Bém
 
Prague data management meetup 2017-05-16
Prague data management meetup 2017-05-16Prague data management meetup 2017-05-16
Prague data management meetup 2017-05-16
Martin Bém
 
Prague data management meetup 2017-03-28
Prague data management meetup 2017-03-28Prague data management meetup 2017-03-28
Prague data management meetup 2017-03-28
Martin Bém
 
Prague data management meetup 2017-04-25
Prague data management meetup 2017-04-25Prague data management meetup 2017-04-25
Prague data management meetup 2017-04-25
Martin Bém
 
Prague data management meetup 2017-02-28
Prague data management meetup 2017-02-28Prague data management meetup 2017-02-28
Prague data management meetup 2017-02-28
Martin Bém
 
Prague data management meetup 2016-11-22
Prague data management meetup 2016-11-22Prague data management meetup 2016-11-22
Prague data management meetup 2016-11-22
Martin Bém
 
Prague data management meetup 2016-10-17
Prague data management meetup 2016-10-17Prague data management meetup 2016-10-17
Prague data management meetup 2016-10-17
Martin Bém
 
Prague data management meetup 2016-09-22
Prague data management meetup 2016-09-22Prague data management meetup 2016-09-22
Prague data management meetup 2016-09-22
Martin Bém
 
Prague data management meetup 2016-03-07
Prague data management meetup 2016-03-07Prague data management meetup 2016-03-07
Prague data management meetup 2016-03-07
Martin Bém
 
Prague data management meetup 2016-01-12 pub
Prague data management meetup 2016-01-12 pubPrague data management meetup 2016-01-12 pub
Prague data management meetup 2016-01-12 pub
Martin Bém
 
Prague data management meetup 2015 11-23
Prague data management meetup 2015 11-23Prague data management meetup 2015 11-23
Prague data management meetup 2015 11-23
Martin Bém
 

More from Martin Bém (20)

Prague data management meetup #30 2019-10-04
Prague data management meetup #30 2019-10-04Prague data management meetup #30 2019-10-04
Prague data management meetup #30 2019-10-04
 
Prague data management meetup #31 2020-01-27
Prague data management meetup #31 2020-01-27Prague data management meetup #31 2020-01-27
Prague data management meetup #31 2020-01-27
 
Meetup 2018-10-23
Meetup 2018-10-23Meetup 2018-10-23
Meetup 2018-10-23
 
Prague data management meetup 2018-04-17
Prague data management meetup 2018-04-17Prague data management meetup 2018-04-17
Prague data management meetup 2018-04-17
 
Prague data management meetup 2018-05-22
Prague data management meetup 2018-05-22Prague data management meetup 2018-05-22
Prague data management meetup 2018-05-22
 
Prague data management meetup 2018-02-27
Prague data management meetup 2018-02-27Prague data management meetup 2018-02-27
Prague data management meetup 2018-02-27
 
Prague data management meetup 2018-01-30
Prague data management meetup 2018-01-30Prague data management meetup 2018-01-30
Prague data management meetup 2018-01-30
 
Prague data management meetup 2017-11-21
Prague data management meetup 2017-11-21Prague data management meetup 2017-11-21
Prague data management meetup 2017-11-21
 
Prague data management meetup 2017-10-24
Prague data management meetup 2017-10-24Prague data management meetup 2017-10-24
Prague data management meetup 2017-10-24
 
Prague data management meetup 2017-09-26
Prague data management meetup 2017-09-26Prague data management meetup 2017-09-26
Prague data management meetup 2017-09-26
 
Prague data management meetup 2017-05-16
Prague data management meetup 2017-05-16Prague data management meetup 2017-05-16
Prague data management meetup 2017-05-16
 
Prague data management meetup 2017-03-28
Prague data management meetup 2017-03-28Prague data management meetup 2017-03-28
Prague data management meetup 2017-03-28
 
Prague data management meetup 2017-04-25
Prague data management meetup 2017-04-25Prague data management meetup 2017-04-25
Prague data management meetup 2017-04-25
 
Prague data management meetup 2017-02-28
Prague data management meetup 2017-02-28Prague data management meetup 2017-02-28
Prague data management meetup 2017-02-28
 
Prague data management meetup 2016-11-22
Prague data management meetup 2016-11-22Prague data management meetup 2016-11-22
Prague data management meetup 2016-11-22
 
Prague data management meetup 2016-10-17
Prague data management meetup 2016-10-17Prague data management meetup 2016-10-17
Prague data management meetup 2016-10-17
 
Prague data management meetup 2016-09-22
Prague data management meetup 2016-09-22Prague data management meetup 2016-09-22
Prague data management meetup 2016-09-22
 
Prague data management meetup 2016-03-07
Prague data management meetup 2016-03-07Prague data management meetup 2016-03-07
Prague data management meetup 2016-03-07
 
Prague data management meetup 2016-01-12 pub
Prague data management meetup 2016-01-12 pubPrague data management meetup 2016-01-12 pub
Prague data management meetup 2016-01-12 pub
 
Prague data management meetup 2015 11-23
Prague data management meetup 2015 11-23Prague data management meetup 2015 11-23
Prague data management meetup 2015 11-23
 


Prague data management meetup 2017-01-23

  • 1. Data Warehouse Components (Komponenty datových skladů), #11, 23. 1. 2017, Prague Data Management Meetup
  • 2. Agenda • Prague Data Management Meetup • Data Warehouse Components
  • 3. Prague Data Management Meetup • Data Management: data acquisition, data storage, data processing, data interpretation, data use • Open professional interest group • Everyone is welcome (in a passive or an active role) • There are never enough topics • We aim to meet regularly every month • Running since September 2015
  • 4. History (Date: Topic)
      10. 9. 2015: Data Management
      14. 10. 2015: Data Lake
      23. 11. 2015: Dark Data (without Dark Energy and Dark Force)
      12. 1. 2016: Data Lake (again)
      7. 3. 2016: Sad Stories About DW Modeling (sad stories only)
      23. 3. 2016: Self-service BI Street Battle
      27. 4. 2016: Let's explore the new Microsoft PowerBI!
      22. 9. 2016: Data Management for Beginners
      17. 10. 2016: Small Big Data
      22. 11. 2016: DW Modeling Basics
      23. 1. 2017: Data Warehouse Components
  • 7. Data Warehouse • Data integration from various data sources in the requested quality and time • Publish and share consistent information for purposes and users • Flexible and effective ad-hoc reporting and analysis • Features • Subject Orientation • Data Integration • Low variability • History • Main perspectives • Data Integration • Data Storage • Data Access • New opportunities • Complex Event Processing in real time • Application Integration • Real-time decision support • Operational Data Store • Integration with Big Data Platform • EDW, DW, DSS, ADS, ADW, DP…
  • 8. Operational Database vs. Data Warehouse (Source: Coursera)
      Characteristic | Operational Database | Data Warehouse
      Currency | Current | Historical
      Details level | Individual | Individual and summary
      Orientation | Process | Subject
      Records per request | Few | Thousands
      Normalization level | Mostly normalized | Normalization relaxed
      Update level | Highly volatile | Mostly refreshed (non volatile)
      Data model | Relational | Relational (star schemas) and multidimensional (data cubes)
  • 9. Adastra Business Intelligence Reference Architecture (diagram labels) ODS Operational reporting Enterprise DWH Big Data Platform Data Lake Event Processing Semantic Models Advanced Analytics Perceptual / cognitive intelligence Information Management Relational / Structured data Unstructured data Streaming Data Workflow Orchestration Data Transformation / Processing Data Management Event Ingestion Complex Event Processing Notifications BI / Application Integration Machine Learning In-database Data Mining, R Recognition of human interaction and intent SMP and MPP In-memory technologies In-memory Columnar In-memory technologies Hadoop, NoSQL Business Intelligence / Data Delivery: Real-time Dashboards, Dashboards and visualizations, Reports, Self-service BI, Mobile BI IoT Network Field Gateway Big data OLAP
  • 10.
  • 11. DW Logical Layers: L0: Stage Area; L1: Relational Area; L1: Consolidation Area; L2: Data Mart Area • Data Mart Area • L2 • User Access Layer • Consolidation Area • Consolidated L1 • Common aggregates for L2 • Cleansed and consolidated data • Relational Area • Detailed L1 • Consistent, integrated, subject-oriented data, universal data structure, historical data, maximal detail • System of record • Foundation Layer • Stage Area • Direct copy of source systems • Note: Consolidated and Detailed L1 can share the same data structures. [Diagram labels: General DWH; Source systems (S1, S2, S3, S4, Other...); Customer DB (CDB); EAI; ETL; Extracts; Staging Area; ODS; Relational Area; Datamart Area (Dependent Datamarts); Materialization; Presentation Layer; OLAP; Reporting; Reports; Application; Analytic tools (SPSS, SAS..)]
  • 12. Data Warehouse Components Data Stores Access Tools Metadata Data Integration Tools Administration and Management Development Tools Not only technology!
  • 13. Data Stores Logical Stores • Data Warehouse • Data Mart • Operational Data Store • Customer Data Integration • Product Data Integration • Data Hub • Data Lake • Data Archive • Big Data Platform Physical Stores • RDBMS • OLAP • HDFS • NoSQL • SMP • MPP • Cluster • Appliance
  • 14. Data Integration Tools Custom Scripts ELT vs. ETL Real-time Change Data Capture Logical Mapping Physical Mapping Workflow Dependencies Restartability Error Handling Scheduling Events Monitoring QA Testing Design Patterns Design Standards Kappa Architecture Lambda Architecture Data Quality Matching Data Cleanup Data Profiling Data Checks
  • 15. Data Sources and Targets (ODI Examples): Apache Derby; Apache HDFS; Apache Hive; Apache HBase; Cloudera CDH; dBase; HyperSQL Database Engine; IBM DB2; IBM DB2 for Linux Unix and Windows; IBM DB2 for i; IBM DB2 for z/OS; IBM Informix; IBM Informix Dynamic Server (DS); IBM Informix Extended Parallel Server (XPS); IBM Netezza NPS; IBM WebSphere MQ; Ingres; InterBase; ISO Database Language SQL (generic SQL-92 database); Java Message Service (JMS); Microsoft Access; Microsoft Excel; Microsoft SQL Server; MySQL Server; Oracle Database; Oracle Essbase; Oracle Hyperion Planning; Oracle Service Bus; Oracle TimesTen In-Memory Database; Paradox; Pervasive PSQL; PostgreSQL; SAP BW; SAP ERP ECC; Teradata Database; Text files; XML files; JSON files
  • 16. Batch Data Transformation: ETL vs. ELT • ELT: Extract, Load, Transformation • ETL: Extract, Transformation, Load
  • 17. Data Transformation Model Pattern (diagram labels): Source table; Target table; Filter SRC; Filter TRG; Differential member (minus, outer join, or merge); Filter OUT; Lookup tables; Join SRC
  • 22.
  • 23. Design Standards • Naming conventions • Models, nodes, layers, schemas • Entities • Attributes • Keys • Relationships • Indexes, constraints • Level of normalization • Level of convergence • Key strategy • Standard attributes Supplier standards Customer standards Third party product standards
  • 24. Typical Metadata Backbone: Primary generic principles of the data warehouse solution (goals and objectives); Logical architecture document; Physical architecture document; Environment map; Logical data model name and design standard; Data history and retention strategy; Physical data model design standard including partitioning; List of acronyms; Business rule document; KPI catalogue; Transformation design standard; Process definition for analysis, design, development, testing, release management and bug fixing; Templates for analysis, design, development, testing, release management and bug fixing; Test catalog; Developer guideline; Operation guideline including DR strategy; Quality assurance; SLA templates; Information delivery strategy; Data Mart strategy; Data Quality strategy; Business entity mapping to data model with examples for end users; Business data domain document; Security architecture
  • 25. Access Tools • Microsoft Excel rulez • SQL Query Tools • Enterprise Reporting Tools • Self-service BI • Data Discovery • Data Mining Tools (Weka, Azure ML..) • Statistics Tools (SPSS, SAS, R…) • Information Delivery • Real-time decision • Application integration
  • 26.
  • 27. Oracle Business Intelligence • On-premises and cloud variants • Support for advanced analytics, self-service visualization, and Mobile BI • Oracle Big Data Discovery • Native self-service analytics for Big Data solutions
  • 28. Tableau Desktop Qlik Sense Desktop Microsoft Power BI Desktop SAS Visual Analytics
  • 29. Administration and Management: Daily operations; Environments; Data quality checks; Managing and updating metadata; Auditing and reporting data warehouse usage and status; Purging data; Replicating, subsetting and distributing data; Backup and recovery; Data warehouse storage management; Bug tracking and fixing
  • 30. DW Stacks (column headers: Stack, Others)
      RDBMS: Oracle Database, MySQL, Microsoft SQL Server, Microsoft SQL Server APS, Azure SQL Data Warehouse, Amazon Redshift, HP Vertica, IBM dashDB, IBM DB2, PostgreSQL, SAP HANA, SAP IQ, SAP SQL Anywhere, Teradata Database
      ETL/ELT: Oracle Data Integrator, Oracle Golden Gate, MS Integration Services, Azure Data Factory, Clover ETL, IBM InfoSphere DataStage, Informatica PowerCenter, Pentaho Data Integration, SAP Data Services, SAS Data Integration, Talend Data Integration
      BI & Analytics: Oracle Big Data Discovery, Oracle Business Intelligence, Oracle Endeca Data Discovery, Oracle Essbase, Oracle R Enterprise, Azure Machine Learning, MS Analysis Services, MS Datazen, MS Excel BI, MS Power BI, MS Reporting Services, Revolution R, Amazon QuickSight, GoodData, IBM Cognos Reporting, IBM Watson Analytics, Microstrategy Analytics, Qlik Sense, Qlikview, SAP Business Objects, SAS Visual Analytics, Tableau, Teradata Aster Discovery Platform
      Appliances: Oracle Exadata, Oracle SuperCluster, MS Analytic Platform System, IBM Netezza Twinfin, SAP HANA, Teradata Data Warehouse Appliance, HP Vertica Analytics System
  • 31. Microsoft Stack: Excel + Power BI add-ins (Query, Pivot, View, Map); SharePoint (Power Pivot Gallery, Power View); Excel Data Mining; Power BI Desktop; Power BI Portal; Azure ML; End-to-End DW & Big Data Platform, Driving Analytics on any Data; Power BI Mobile App; Analytics Platform System (APS)
  • 33. Data Warehouse Components Data Stores Access Tools Metadata Data Integration Tools Administration and Management Development Tools Which ones are missing? People? BICC?
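
The transformation pattern named on slide 17 (filter the source, join lookup tables, compute a differential against the target, load the result) can be sketched roughly as follows. This is only an illustration, not the speaker's implementation: the table names (src_customer, lkp_country, trg_customer), the columns, and the function names are all hypothetical, and the differential step uses the outer-join variant, one of the three options the slide lists. Because the transformation runs inside the database after the data has been loaded, it is closer to the ELT style from slide 16.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with hypothetical source, lookup and target tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE src_customer (
            customer_id INTEGER PRIMARY KEY, name TEXT,
            country_code TEXT, is_deleted INTEGER);
        CREATE TABLE lkp_country (
            country_code TEXT PRIMARY KEY, country_name TEXT);
        CREATE TABLE trg_customer (
            customer_id INTEGER PRIMARY KEY, name TEXT, country_name TEXT);

        INSERT INTO lkp_country VALUES ('CZ', 'Czechia'), ('DE', 'Germany');
        INSERT INTO src_customer VALUES
            (1, 'Alice', 'CZ', 0), (2, 'Bob', 'DE', 0), (3, 'Carol', 'CZ', 1);
        INSERT INTO trg_customer VALUES
            (1, 'Alice', 'Czechia'), (2, 'Robert', 'Germany');
        """
    )
    return conn


def run_differential_load(conn):
    """Apply new or changed source rows to the target and return the applied rows."""
    delta = conn.execute(
        """
        SELECT s.customer_id, s.name, l.country_name
        FROM src_customer AS s
        -- Join SRC: enrich source rows from the lookup table
        JOIN lkp_country AS l ON l.country_code = s.country_code
        -- Differential member: outer join against the target
        LEFT JOIN trg_customer AS t ON t.customer_id = s.customer_id
        -- Filter SRC: ignore rows flagged as deleted in the source
        WHERE s.is_deleted = 0
          -- keep only rows that are missing from the target or differ from it
          AND (t.customer_id IS NULL
               OR t.name <> s.name
               OR t.country_name <> l.country_name)
        ORDER BY s.customer_id
        """
    ).fetchall()
    # Load: upsert the differential set into the target table
    conn.executemany(
        "INSERT OR REPLACE INTO trg_customer (customer_id, name, country_name) "
        "VALUES (?, ?, ?)",
        delta,
    )
    conn.commit()
    return delta


if __name__ == "__main__":
    demo = build_demo()
    print(run_differential_load(demo))  # rows applied on the first run
    print(run_differential_load(demo))  # a second run finds nothing left to apply
```

Computing the differential before loading means that re-running the job after a failure applies only what is still missing, which is one simple way to approach the restartability item from slide 14.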