A company of Daimler AG
LECTURE @DHBW: DATA WAREHOUSE
PART IV: FRONTEND, METADATA,
PROJECTS, ADVANCED TOPICS
ANDREAS BUCKENHOFER, DAIMLER TSS
ABOUT ME
https://de.linkedin.com/in/buckenhofer
https://twitter.com/ABuckenhofer
https://www.doag.org/de/themen/datenbank/in-memory/
http://wwwlehre.dhbw-stuttgart.de/~buckenhofer/
https://www.xing.com/profile/Andreas_Buckenhofer2
Andreas Buckenhofer
Senior DB Professional
andreas.buckenhofer@daimler.com
Since 2009 at Daimler TSS
Department: Big Data
Business Unit: Analytics
DAIMLER TSS. IT EXCELLENCE: COMPREHENSIVE, INNOVATIVE, CLOSE.
We're a specialist and strategic business partner for innovative IT Solutions within Daimler –
not just another supplier!
As a 100% subsidiary of Daimler, we live the culture of excellence and aspire to take an
innovative and technological lead.
With our outstanding technological and methodical competence we are a competent provider of
services that help those who benefit from them to stand out from the competition. When it
comes to demanding IT questions we create impetus, especially in the core fields car IT and
mobility, information security, analytics, shared services and digital customer
experience.
Data Warehouse / DHBWDaimler TSS GmbH 3
TSS 2020: ALWAYS ON THE MOVE.
LOCATIONS
Daimler TSS China
Hub Beijing
6 Employees
Daimler TSS Malaysia
Hub Kuala Lumpur
38 Employees
Daimler TSS India
Hub Bangalore
16 Employees
Daimler TSS Germany
6 Locations
More than 1000 Employees
Ulm (Headquarters)
Stuttgart Area
Böblingen, Echterdingen,
Leinfelden, Möhringen
Berlin
• After the end of this lecture you will be able to
• Understand the function of frontend tools and information design
• Understand the necessity for metadata
• Understand the lifecycle of DWH projects
• Understand advanced topics such as Operational BI, DWH Appliances, and Cloud BI
WHAT YOU WILL LEARN TODAY
FRONTEND
LOGICAL STANDARD DATA WAREHOUSE ARCHITECTURE
[Figure: internal and external data sources (e.g. OLTP systems) feed the Data Warehouse. Backend layers shown: Staging Layer (Input Layer), Integration Layer (Cleansing Layer), Core Warehouse Layer (Storage Layer), Aggregation Layer, Mart Layer (Output Layer / Reporting Layer). The Frontend sits on top of the Backend. Cross-cutting components: Metadata Management, Security, DWH Manager incl. Monitor.]
• Reporting (Standard, ad-hoc)
• OLAP
• Dashboards, Scorecards
• Advanced Analytics / Data Mining / Text Mining
• Search & Discovery
INTERFACE TO THE END USER
Standard Reports
• Prepared static reports that end users can execute on request
• Or are executed automatically at the end of an ETL process and, e.g., sent by email to end users
• Normally based on fact tables and their dimensions
• Reports are often lists similar to Excel sheets but can also contain graphics (e.g. line charts)
Ad-hoc Reports
• End users create their own reports („Self service“)
REPORTING (STANDARD, AD-HOC)
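A standard report of this kind is often little more than a parameterised query that joins a fact table to its dimensions. The following is a minimal sketch using Python's built-in sqlite3 module. The table names (fact_sales, dim_product, dim_date), the columns and the sample rows are invented for illustration and do not come from the lecture.

```python
import sqlite3

def build_demo_mart() -> sqlite3.Connection:
    """Create a tiny in-memory star schema (hypothetical names and data)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, month TEXT);
        CREATE TABLE fact_sales  (product_id INTEGER, date_id INTEGER, revenue REAL);
        INSERT INTO dim_product VALUES (1, 'Cars'), (2, 'Vans');
        INSERT INTO dim_date    VALUES (1, '2017-01'), (2, '2017-02');
        INSERT INTO fact_sales  VALUES (1, 1, 100.0), (1, 2, 150.0),
                                       (2, 1,  40.0), (2, 2,  60.0);
    """)
    return con

def revenue_report(con, month):
    """Standard report: revenue per product category for one month."""
    return con.execute(
        """
        SELECT p.category, SUM(f.revenue) AS revenue
        FROM fact_sales f
        JOIN dim_product p ON p.product_id = f.product_id
        JOIN dim_date d    ON d.date_id = f.date_id
        WHERE d.month = ?
        GROUP BY p.category
        ORDER BY p.category
        """,
        (month,),
    ).fetchall()

con = build_demo_mart()
report = revenue_report(con, "2017-02")
for category, revenue in report:
    print(f"{category:<10}{revenue:>10.2f}")
```

An ad-hoc report differs mainly in who writes the query: the end user composes it in a self-service tool instead of running a prepared one.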
ROLAP / MOLAP Client Frontend
• Prepared cubes (multidimensional or relational fact tables)
• User can perform interactive analysis of data
• Rollup / drill-down
• Pivot
• Slicing
• Dicing
OLAP
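The OLAP operations listed above can be illustrated on a tiny cube held in memory. This is only a sketch in plain Python. The fact rows, the dimension names and the helper functions (rollup, slice_, dice, pivot) are assumptions made for this example; a real OLAP client would send such operations to a ROLAP or MOLAP server.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, revenue)
FACTS = [
    (2016, "EU", "Car", 10), (2016, "EU", "Van", 4),
    (2016, "US", "Car", 7),  (2017, "EU", "Car", 12),
    (2017, "US", "Car", 9),  (2017, "US", "Van", 5),
]
DIMS = ("year", "region", "product")

def rollup(facts, keep):
    """Aggregate revenue over all dimensions except those named in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in idx)] += row[3]
    return dict(totals)

def slice_(facts, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return [r for r in facts if r[i] == value]

def dice(facts, **allowed):
    """Dice: restrict several dimensions to sets of allowed values."""
    return [r for r in facts
            if all(r[DIMS.index(d)] in vals for d, vals in allowed.items())]

def pivot(facts, row_dim, col_dim):
    """Pivot: one dimension on the rows, another on the columns."""
    table = defaultdict(dict)
    for (r, c), v in rollup(facts, (row_dim, col_dim)).items():
        table[r][c] = v
    return dict(table)

by_year = rollup(FACTS, ("year",))                   # roll-up to year level
by_year_region = rollup(FACTS, ("year", "region"))   # drill-down by one level
eu_only = slice_(FACTS, "region", "EU")
cars_2017 = dice(FACTS, year={2017}, product={"Car"})
year_by_region = pivot(FACTS, "year", "region")
```

Drill-down is simply a roll-up that keeps one more dimension, which is why both are served by the same function here.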
„Progress reports“
Provide an overall view of KPIs (Key Performance Indicators)
Combination of several elements from Reporting and/or OLAP (e.g. line
charts) into an overall view (like a „cockpit“)
Dashboard is more focused on operational goals
• High-level overview what is happening
Scorecard is more focused on strategic goals
• Plan a strategy and identify why something happens
DASHBOARDS, SCORECARDS
See Mr. Bollinger‘s lecture
ADVANCED ANALYTICS / DATA MINING / TEXT MINING
Not just numerical data
Analysis of new data types gets more and more important
• Text
• GPS coordinates
• Pictures
• Videos
Data can be available in RDBMS (e.g. text modules/indexes available),
Hadoop or NoSQL DBs
SEARCH & DISCOVERY
MANY GRAPHICAL ELEMENTS TO USE IN REPORTS
Source: https://github.com/d3/d3/wiki/Gallery
MANY GRAPHICAL ELEMENTS … CHAMBER OF HORROR
Source: Hichert / Faisst, http://www.backup-page.hichert.com/
Some remarks about previous slide
• 3D elements introduce clutter and add no information
• Pie charts rarely make sense
• Line chart barely readable
• Labels are placed outside of the graphic
• Tachometer costs a lot of space and show
• Too much color in general
• Color without meaning, e.g. red should be used for alarms / errors
MANY GRAPHICAL ELEMENTS … CHAMBER OF HORROR
STORY TELLING WITH APPROPRIATE VISUALIZATION
Famous example by Hans Rosling (watch 3:08 onwards)
https://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen?language=de
Information design is the practice of presenting information in a way that
fosters efficient and effective understanding of it.
(source: Wikipedia, https://en.wikipedia.org/wiki/Information_design )
Some authors are well known for their criticism of many graphical
representations - they provide rules for good information design
• Edward Tufte
• Stephen Few
• Rolf Hichert
INFORMATION DESIGN
Define standards, e.g.
• Always use the same colors, and use them with care (red = negative, green = positive)
• Pie charts are rarely useful and should be avoided (a bar chart or line chart is usually better)
• No 3D elements, as they don't enhance information but introduce clutter
INFORMATION DESIGN
TABLE WITH INTEGRATED BAR CHARTS
Source: Hichert, http://www.hichert.com/de/resource/table-template-02/
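The idea of a table with integrated bar charts can be sketched even in plain text: each row carries its value plus a bar scaled to the largest value, with no color, no 3D and no separate legend. The function name table_with_bars and the sample figures are made up for this illustration and are not taken from the Hichert template.

```python
def table_with_bars(rows, width=20, bar_char="#"):
    """Render (label, value) rows as text lines with a proportional bar per row."""
    peak = max(value for _, value in rows)
    label_w = max(len(label) for label, _ in rows)
    lines = []
    for label, value in rows:
        # Scale every bar against the largest value so rows stay comparable.
        bar = bar_char * round(width * value / peak) if peak else ""
        lines.append(f"{label:<{label_w}}  {value:>6}  {bar}")
    return lines

sales = [("North", 120), ("South", 60), ("West", 30)]
lines = table_with_bars(sales)
for line in lines:
    print(line)
```

Because all bars share one scale, the reader can compare rows at a glance without reading the numbers first.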
Consumers / BI Users
• Use reports, OLAP and dashboards to obtain information
Power Users
• Use reports, OLAP and dashboards to obtain information
• Create new reports and dashboards
Data Scientists
• Statistical / mathematical geeks
• Analyze / explore data
• Need to analyze raw (non-cleansed, non-transformed) data
BI END USER ROLES
META DATA
WHAT IS METADATA?
Data about other data
Business Metadata
• Definition of business vocabulary and relationships
• Definition of the value range
• Linkage to physical representation
TYPES OF METADATA
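A business metadata entry as described above could be modelled roughly as follows. This is a sketch under assumptions: the class BusinessTerm, its fields and the sample term "Discount rate" are hypothetical and only show how a definition, a value range and the linkage to the physical representation can live in one record.

```python
from dataclasses import dataclass, field

@dataclass
class BusinessTerm:
    """Business metadata entry (all names here are illustrative)."""
    name: str
    definition: str
    min_value: float                                       # value range, lower bound
    max_value: float                                       # value range, upper bound
    physical_columns: list = field(default_factory=list)   # linkage: "schema.table.column"
    related_terms: list = field(default_factory=list)      # relationships in the vocabulary

    def accepts(self, value: float) -> bool:
        """Check a value against the documented value range."""
        return self.min_value <= value <= self.max_value

discount = BusinessTerm(
    name="Discount rate",
    definition="Price reduction granted to a customer, as a fraction of list price",
    min_value=0.0,
    max_value=1.0,
    physical_columns=["mart.fact_sales.discount_rate"],
    related_terms=["List price", "Net price"],
)
```

The same record can then serve a report user (definition), an ETL designer (value range) and a database administrator (physical column).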
Report metadata
• Report definitions
• Data sources
• Column definitions
• Computations
Logical and physical metadata of data model
• Table structure
• Definition of columns
• Relationships between tables and columns
• Dimension hierarchy
TYPES OF METADATA
THE AREAS OF METADATA
THE AREAS OF METADATA CONNECTED
Components of a data warehouse system are interconnected
• BI report user has to know
• the meaning, definitions of the shown measures, „KPIs“ (key performance indicators)
• BI report designer has to know
• the table definitions
• the meaning of the column values
• ETL job designer has to know
• the table definitions or the exact definition of the measures
• Database administrator has to know
• Which tables are used by ETL jobs, reports
WHY A COMMON METADATA REPOSITORY?
Metadata driven ETL development
• Generate parts of ETL code
• Increasing interest, especially for Data Vault development projects
• Tools, e.g. MID Innovator, Quipu, AnalytiX DS, Talend, Pentaho, WhereScape, and others
Common metadata repository ensures consistency across all components
• Many tools involved (DB, ETL, Frontend, …)
Enables cross component metadata analysis
• Data Lineage
• Impact Analysis
WHY A COMMON METADATA REPOSITORY?
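Metadata-driven ETL development means that load code is generated from mapping metadata instead of being written by hand. The sketch below shows the principle only. The mapping structure, the table names (stage.customer, core.customer) and the function generate_load_sql are assumptions for this example and do not reflect how any of the tools named above work internally.

```python
# Hypothetical mapping metadata: one entry per target table.
MAPPINGS = [
    {
        "source": "stage.customer",
        "target": "core.customer",
        "columns": {                 # target column -> source expression
            "customer_id": "cust_no",
            "name": "TRIM(cust_name)",
            "country": "UPPER(country_code)",
        },
    },
]

def generate_load_sql(mapping):
    """Generate an INSERT ... SELECT statement from one mapping entry."""
    targets = ", ".join(mapping["columns"])
    sources = ", ".join(
        f"{expr} AS {col}" for col, expr in mapping["columns"].items()
    )
    return (
        f"INSERT INTO {mapping['target']} ({targets})\n"
        f"SELECT {sources}\n"
        f"FROM {mapping['source']};"
    )

statements = [generate_load_sql(m) for m in MAPPINGS]
print(statements[0])
```

Since the mapping is data, the same repository entry can feed code generation, lineage and impact analysis, which is the consistency argument made on this slide.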
“Data lineage”
• Import & Browse Full BI Report Metadata
• Navigate through report attributes
• Visually navigate through data lineage across tools
• Combines operational & design viewpoint
WHY A COMMON METADATA REPOSITORY?
“Impact Analysis”
• Show complete change impact in graphical or list form
• Includes impact on reports in BI tools
• Visually navigate through impacted objects across tools
• Allows impact analysis on any object type
WHAT HAPPENS IF I CHANGE THIS COLUMN?
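Both impact analysis and data lineage reduce to walking a dependency graph stored in the metadata repository: downstream for impact, upstream for lineage. The following sketch assumes a simple dictionary of edges. The object names and the functions impacted_by and lineage_of are invented for illustration.

```python
from collections import deque

# Hypothetical lineage edges: object -> objects that read from it.
LINEAGE = {
    "src.orders.amount":       ["stage.orders.amount"],
    "stage.orders.amount":     ["core.sales.revenue"],
    "core.sales.revenue":      ["mart.fact_sales.revenue"],
    "mart.fact_sales.revenue": ["report.monthly_revenue", "dashboard.kpi_overview"],
}

def impacted_by(obj, edges):
    """Impact analysis: everything downstream of `obj` (breadth-first)."""
    seen, queue = [], deque([obj])
    while queue:
        current = queue.popleft()
        for nxt in edges.get(current, []):
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen

def lineage_of(obj, edges):
    """Data lineage: everything upstream of `obj`, found via the reversed edges."""
    reverse = {}
    for src, targets in edges.items():
        for tgt in targets:
            reverse.setdefault(tgt, []).append(src)
    return impacted_by(obj, reverse)

impact = impacted_by("core.sales.revenue", LINEAGE)
origin = lineage_of("report.monthly_revenue", LINEAGE)
```

The traversal only works across tools if DB, ETL and frontend metadata end up in one graph, which is the case for a common repository.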
• Show relationships between business terms, data model entities, and
technical and report fields
• Requires cross-tool mapping of business terms
• Allows field meaning to be understood
• Allows business term relationships to be understood
WHAT DOES THIS FIELD MEAN?
• Shows objects that user manages
• Shows stewardship relationships on business terms
• Shows user group associations
WHAT OBJECTS DOES THIS USER OWN?
• Navigation through complete job details
• Navigation of complete operational metadata
WHAT HAPPENED ON THE LAST JOB RUN?
DATA WAREHOUSING PROJECTS
LOGICAL STANDARD DATA WAREHOUSE ARCHITECTURE
[Figure: the logical standard data warehouse architecture (data sources; Staging, Integration, Core Warehouse, Aggregation and Mart Layers; Frontend; Metadata Management, Security and DWH Manager incl. Monitor), annotated with the two design approaches Top-Down (Inmon) and Bottom-Up (Kimball).]
Top-Down (Inmon)
• Comprehensive approach regarding available data
• Design Core Warehouse Layer = integrated data model first considering all
requirements
• Design data marts afterwards
Bottom-Up (Kimball)
• Approach focusing on fast delivery of first results
• Design one data mart first
• Further marts are modeled afterwards, usually following the Kimball architecture
• Conformed dimensions integrate the different data marts / fact tables
TOP-DOWN VS BOTTOM-UP APPROACH
TOP-DOWN VS BOTTOM-UP APPROACH
ADVANTAGES AND DISADVANTAGES
Top-Down (Inmon)
☺ Core Warehouse Layer is designed optimally
☺ Data from Core Warehouse Layer is reused in many Marts
☹ Time-consuming approach with high preparatory effort
☹ High risk with changing requirements
Bottom-Up (Kimball)
☺ Early involvement of end users
☺ Fast results
☹ Focus on single Marts leads to risk that the overall view is lost, esp. a properly designed Core Warehouse Layer
☹ Data often not reused but inconsistently copied across Marts
Both approaches have their down-sides
• Top-Down takes enormous initial effort to build data model for Core Warehouse Layer
• Bottom-Up is risky as central / integrated focus is lost
Think big, start small
• Think Big: Design conceptual data model for Core Warehouse Layer
covering whole enterprise
• Start small: Implement physical data model for Core and Mart Layer in
iterations by each business department
THINK BIG, START LOCAL
• Classical
• Waterfall model
• Incremental model
• Agile
• Scrum
• Kanban
PROCESS MODEL
• DWH is not a system or product
• DWH databases are more complex with different layers and data models
• Data first, code is secondary
• Data quality is a major concern
• Data integration is a challenging objective
• Business need difficult to justify quantitatively
WHAT’S DIFFERENT IN DWH PROJECTS?
WHY DO DWH PROJECTS FAIL?
Feasibility study → Analysis → Design → Implementation → Test → Operations and maintenance
PROJECT PHASES
• Benefits of DWH
• Cost-effectiveness
• SW selection
• HW selection
• Staff requirements including external know-how
• Data protection and data security agreement, data classification
• Proof of Concept (PoC) to challenge different possible solutions
• Architectural concept
FEASIBILITY STUDY
• Documentation of user requirements: specification sheet
• Backend including ETL
• Frontend
• Security
• Metadata, Business glossary
• Non-functional requirements
• Analysis of data sources
• Data quality
• Data models
• Data security
ANALYSIS
• Technical description how to implement specifications
• Data model for different DWH layers
• Data integration design
• Frontend design
• Security concept
• Capacity planning
DESIGN
• Installation of development, test, integration, production, and maintenance environments
• Usage of the metadata repository for implementation of data model, ETL, frontend, security
• Launch of DWH
• Release Management
IMPLEMENTATION
• Functional
• Data quality / data validation
• Usability
• Performance
• Operational
• Security
TEST
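The data quality / data validation tests in this phase are often simple reconciliation checks between a source and the loaded target. The following is a minimal sketch with three such checks (row counts, key uniqueness, required values). The function validate_load, the column names and the sample rows are assumptions for illustration, not a complete test concept.

```python
def validate_load(source_rows, target_rows, key, required):
    """Return a list of human-readable findings; an empty list means all checks passed."""
    findings = []
    # Check 1: did every source row arrive in the target?
    if len(source_rows) != len(target_rows):
        findings.append(
            f"row count mismatch: source={len(source_rows)} target={len(target_rows)}"
        )
    # Check 2: is the key column unique in the target?
    keys = [row[key] for row in target_rows]
    if len(keys) != len(set(keys)):
        findings.append(f"duplicate values in key column '{key}'")
    # Check 3: are all required columns populated?
    for col in required:
        missing = sum(1 for row in target_rows if row.get(col) is None)
        if missing:
            findings.append(f"{missing} missing value(s) in column '{col}'")
    return findings

source = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}, {"id": 3, "name": "C"}]
target = [{"id": 1, "name": "A"}, {"id": 2, "name": None}]
findings = validate_load(source, target, key="id", required=["name"])
```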
• Deployment of new features, changes or bug fixes
• End user training
• Monitoring
• Production concept
• Initial load and future delta loads
• Keep the system running
OPERATIONS AND MAINTENANCE
Organizational team that coordinates and standardizes DWH activities within an
(end user) organization
• Define standards and create BI portfolio (e.g. which tools/products to use)
• Create DWH architecture and govern BI activities
• Establish processes for business and IT interaction
• Monitor DWH/BI market for new trends
• Determine skills and experience of Business users
BICC: BI COMPETENCY CENTER (BI CENTER OF EXCELLENCE)
Define 3-5 criteria for the evaluation of an ETL tool
How does a relational DBMS (like Oracle, DB2, MS SQL Server) meet these
requirements?
EXERCISE
• Supplier profile
• Support
• HW/SW requirements
• License / maintenance Costs
• Usability
• Reliability
• Performance and scalability
• Multi-tenant
• Interfaces
• Scheduling
EXERCISE - DEFINE 5 CRITERIA FOR THE EVALUATION OF AN ETL TOOL
• RDBMS provide many of the functionalities, but additional programming is required
• RDBMS are often used for ETL/ELT by programming with SQL, PL/SQL, SQLT, etc.
EXERCISE - HOW DOES A RELATIONAL DBMS MEET THESE REQUIREMENTS?
ETL Tool | Manual ETL
Informatica, Talend, Oracle ODI, etc. | SQL, PL/SQL, SQLT, etc.
Separate license | No additional license
Workflow, error handling, and restart/recovery functionality included | Workflow, error handling, and restart/recovery functionality must be implemented manually
Impact analysis and where-used (lineage) functionality available | Impact analysis and where-used (lineage) functionality difficult
Faster development, easier maintenance | Slower development, more difficult maintenance
Additional (tool) know-how required | Know-how often available
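To give a feeling for what "must be implemented manually" means for workflow, error handling and restart/recovery, here is a small sketch of a restartable job sequence. It is written in Python rather than SQL or PL/SQL for brevity. The function run_workflow, the checkpoint set and the three demo steps are hypothetical; a real implementation would persist the checkpoint in a control table.

```python
def run_workflow(steps, completed=None):
    """Run steps in order, skipping those already completed.

    `steps` is a list of (name, callable); `completed` is the checkpoint
    (a set of step names) carried over from a previous, failed run.
    Returns (completed, error); error is None when every step succeeded.
    """
    completed = set(completed or ())
    for name, action in steps:
        if name in completed:
            continue                      # restart: skip finished work
        try:
            action()
        except Exception as exc:          # error handling: stop and keep the checkpoint
            return completed, f"{name} failed: {exc}"
        completed.add(name)
    return completed, None

calls = []
state = {"fail_once": True}

def extract():
    calls.append("extract")

def transform():
    calls.append("transform")
    if state["fail_once"]:                # simulate a one-time failure
        state["fail_once"] = False
        raise RuntimeError("source file locked")

def load():
    calls.append("load")

STEPS = [("extract", extract), ("transform", transform), ("load", load)]
checkpoint, error = run_workflow(STEPS)                # first run fails in transform
checkpoint, error2 = run_workflow(STEPS, checkpoint)   # restart resumes at transform
```

An ETL tool ships this bookkeeping out of the box, which is part of what the separate license pays for.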
NEWER / ADVANCED TOPICS
• OPERATIONAL DATA WAREHOUSING
• DATA WAREHOUSE APPLIANCES
• CLOUD BI
„Classical“ Data Warehouses
• Information in the warehouse used to support strategic business decisions
• Kept separate from operational systems
• Load of new data only in larger intervals (mostly weekly or monthly)
• Shorter intervals not required by users
• ETL process requires huge system resources
Near Real Time Operational Data Warehousing
• Information in the warehouse used for tactical business decisions as well
• Low latency of information in data warehouse therefore needed
• Not only mathematical aggregations
OPERATIONAL DATA WAREHOUSING
With classical data warehouses users have to access two types of systems to
get a complete image of a customer (for instance for CRM applications or in
call centers)
• the data warehouse to see what happened in the past
• the OLTP systems to get the most current information
With an operational data warehouse
• all this information is in one system
• tighter integration with operational systems is easier
• for instance personalized offers „closing the loop“
WHY OPERATIONAL DATA WAREHOUSING?
SMARTFACTORY: OPERATIONAL BI SERVICE PLATFORM
Source: Gluchowski: Analytische Informationssysteme, 5.Aufl., p. 279
• Workers getting alarms on their watch
• Containing and displaying complex manuals, e.g. during repair
• New data sources sending lots of data with high speed (sensor data, logs, etc.)
• Right-Time data required for automated actions, e.g. cordless screwdriver knows and adjusts torque
Near real-time / right-time ETL
• Executed asynchronously; triggered by business transactions in the OLTP
• Incremental real-time load
• Tighter integration of operational and data warehouse systems
DWHs become „mission critical“
• Higher requirements on availability and performance
Higher „transactional“ system load on data warehouse system
• DWH DB has to deal with typical DWH system load and transactional load
Data Quality mandatory
• Data is used for automated decisions
CHALLENGES FOR OPERATIONAL DATA WAREHOUSING
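An incremental load copies only the rows that changed since the previous run. The sketch below uses a polling approach with a high-water mark, which is a simplification: the slide describes loads triggered by business transactions, whereas this example merely re-scans the source in small batches. The function incremental_load, the row layout and the integer timestamps are assumptions made for illustration.

```python
def incremental_load(source_rows, target, watermark):
    """Copy rows changed after `watermark` into `target` (a dict keyed by id).

    Rows are dicts with 'id', 'changed_at' and 'value'; 'changed_at' is any
    comparable timestamp (integers here for brevity). Returns the new watermark.
    """
    new_watermark = watermark
    for row in source_rows:
        if row["changed_at"] > watermark:
            target[row["id"]] = row["value"]      # upsert into the warehouse
            new_watermark = max(new_watermark, row["changed_at"])
    return new_watermark

oltp = [
    {"id": 1, "changed_at": 10, "value": "new order"},
    {"id": 2, "changed_at": 20, "value": "new order"},
]
dwh = {}
wm = incremental_load(oltp, dwh, watermark=0)     # first micro-batch loads both rows

oltp[0] = {"id": 1, "changed_at": 30, "value": "order shipped"}
wm = incremental_load(oltp, dwh, watermark=wm)    # only the changed row is copied
```

The shorter the polling interval, the more this load competes with regular reporting queries, which is the mixed-workload challenge named above.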
COMPARISON CLASSICAL DWH – OPERATIONAL DWH
Classical DWH | Operational DWH
Strategic: passive, historical trends | Tactical: prediction, automatic execution of decisions
Batch: e.g. daily batch | Near Real-Time / Right-Time: up-to-date view
Lower availability: system can be down for maintenance, and longer response times for some reports are accepted | Availability: system becomes critical and must fulfill high availability and performance requirements
Setting up and configuring a data warehouse system is a complex task
• Hardware
• Servers + Storage + Network
• Connectivity to source systems
• Software
• Database management system
• ETL software
• Reporting and analytics software
• ...
An optimal performance of the whole system is difficult to achieve
DATA WAREHOUSE APPLIANCES
Data Warehouse Appliances are
• Pre-configured and pre-tested hardware and software configurations developed for running a data warehouse
• Optimized for data warehousing workloads; suited only for running OLAP
• In contrast to the "one size fits all" approach: general-purpose RDBMS are suited for OLTP, OLAP and mixed workloads
• Ready to be used after they are delivered to the customer
• Products, e.g. Teradata, HP Vertica, Exasol, Oracle Exadata, IBM Netezza
(IBM PureData System for Analytics), MS Analytic Platform System
DATA WAREHOUSE APPLIANCES
APPLIANCE SIMPLICITY (E.G. ORACLE EXADATA)
• Move as many operations as possible to storage cell instead of moving
data to the DB server
• E.g. filter data already at storage cell and not at DB server
• Avoid transferring unnecessary data
• Column-oriented In-memory storage with high compression
• Many appliances are based on shared nothing architecture
• Each node is independent
• Each node has its own storage or memory
• Parallel processing simpler and faster as no overhead due to contention
TYPICAL ENHANCEMENTS
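Two of these enhancements, filtering at the storage side and shared-nothing parallelism, can be mimicked in a few lines. This is a toy model only: the partitions in NODES, the functions scan_node and total_for_region, and the use of threads to stand in for independent nodes are all assumptions for illustration and say nothing about how a specific appliance is built.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared-nothing cluster: each node owns its own partition of (region, amount).
NODES = [
    [("EU", 10), ("US", 5), ("EU", 7)],
    [("US", 3), ("EU", 1)],
    [("ASIA", 8), ("EU", 2)],
]

def scan_node(partition, region):
    """Runs 'at the storage node': filter and pre-aggregate locally,
    so only one number per node travels to the coordinator."""
    return sum(amount for r, amount in partition if r == region)

def total_for_region(region):
    """Coordinator: fan out to all nodes in parallel, then combine the partial sums."""
    with ThreadPoolExecutor(max_workers=len(NODES)) as pool:
        partials = list(pool.map(lambda part: scan_node(part, region), NODES))
    return sum(partials), partials

eu_total, eu_partials = total_for_region("EU")
```

Each node returns a single partial sum instead of its raw rows, which is the "avoid transferring unnecessary data" point in miniature.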
• BI applications (database, ETL tools, Frontend) are hosted in a public
cloud, e.g.
• AWS (Amazon Web Services)
• Microsoft Azure
• …
• Many tools nowadays are available in the cloud first
• Vendors try to force customers to use clouds
• Or even available in the cloud only
• E.g. Microsoft Power BI
CLOUD BI
• Security concerns for sensitive data
• But new data sources originate on the Internet, so storing their data in a (public) cloud can make sense, e.g.
• Connected Cars, IoT in general
CLOUD BI
CLOUD BI ARCHITECTURE
Source: Lang: Business Intelligence erfolgreich umsetzen, 5.Aufl., p. 185
• Analytics as a service
• Provide complete BI (Analytics) SW stack including
• data storage
• data integration (ETL)
• data visualization and/or data modeling (Frontend)
• Metadata management
• Data as a service
• Provide quality data for further usage
• Data marketplace
CLOUD BI ARCHITECTURE
CLOUD BI – DATA WAREHOUSING SERVICES
Source: http://db-engines.com/en/system/Amazon+Redshift%3BSnowflake
SNOWFLAKE ARCHITECTURE
Don't confuse the Snowflake product with the Snowflake dimensional model from session 2
Snowflake Storage
• Snowflake loads data into its internal optimized, compressed, columnar format
• Snowflake itself uses (!) Amazon Web Services' S3 (Simple Storage Service) cloud storage
Query Processing
• Each virtual warehouse is an MPP (Massively Parallel Processing) compute cluster composed of multiple compute nodes allocated by Snowflake from Amazon EC2
• Each virtual warehouse is an independent compute cluster that does not share
compute resources with other virtual warehouses
SNOWFLAKE ARCHITECTURE
Cloud Services
• Authentication and access control
• Infrastructure management
• Metadata management
• Query parsing and optimization
• Security
SNOWFLAKE ARCHITECTURE
• Kimball et al: The Data Warehouse Lifecycle Toolkit, Wiley 2008
• Bauer / Günzel: Data-Warehouse-Systeme, dpunkt, 2013
• Köppen et al: Data Warehouse Technologien, mitp, 2016
END OF LECTURE
GOOD TEXT BOOKS
Source: http://dilbert.com/strip/2014-05-07
Daimler TSS GmbH
Wilhelm-Runge-Straße 11, 89081 Ulm / Telefon +49 731 505-06 / Fax +49 731 505-65 99
tss@daimler.com / Internet: www.daimler-tss.com/ Intranet-Portal-Code: @TSS
Domicile and Court of Registry: Ulm / HRB-Nr.: 3844 / Management: Christoph Röger (CEO), Steffen Bäuerle
THANK YOU
How to document / identify requirements?
• Must be easy for non-technical users to understand during the Analysis/Technical concept phase
• Must provide sufficient information for the System Design phase
The following slides provide some example work products that are produced
during the Analysis/Technical concept phase and may be refined during the System Design phase
POSSIBLE DWH ANALYSIS AND DESIGN WORK PRODUCTS
• Answer most important questions of participating business units
• Provide high-quality data
• Introduction in time
• Usage of modern technology
• Business orientation
• Easy to use
• Executive sponsor
• Patience – user acceptance evolves over time
CRITICAL SUCCESS FACTORS FOR BUILDING A DATA WAREHOUSE
• New applications and data sources increase demand for an operational DWH, e.g.
• Industry 4.0 / Smart Factory
• Internet of Things
• Internet of Medical Things
• Connected Cars
EXAMPLES OF OPERATIONAL DATA WAREHOUSING
Source: Gluchowski: Analytische Informationssysteme, 5.Aufl., p. 277
• Replace pen & paper with electronic workflows
• Decision support for each end user and not only management
• Increasing demand to publish same content on different devices

More Related Content

What's hot

Introduction To Msbi By Yasir
Introduction To Msbi By YasirIntroduction To Msbi By Yasir
Introduction To Msbi By Yasirguest7c8e5f
 
Star schema my sql
Star schema   my sqlStar schema   my sql
Star schema my sqldeathsubte
 
Data Warehousing and Bitmap Indexes - More than just some bits
Data Warehousing and Bitmap Indexes  - More than just some bitsData Warehousing and Bitmap Indexes  - More than just some bits
Data Warehousing and Bitmap Indexes - More than just some bitsTrivadis
 
Architecting a Data Warehouse: A Case Study
Architecting a Data Warehouse: A Case StudyArchitecting a Data Warehouse: A Case Study
Architecting a Data Warehouse: A Case StudyMark Ginnebaugh
 
SAS Programming For Beginners | SAS Programming Tutorial | SAS Tutorial | SAS...
SAS Programming For Beginners | SAS Programming Tutorial | SAS Tutorial | SAS...SAS Programming For Beginners | SAS Programming Tutorial | SAS Tutorial | SAS...
SAS Programming For Beginners | SAS Programming Tutorial | SAS Tutorial | SAS...Edureka!
 
business analysis-Data warehousing
business analysis-Data warehousingbusiness analysis-Data warehousing
business analysis-Data warehousingDhilsath Fathima
 
Module 02 teradata basics
Module 02 teradata basicsModule 02 teradata basics
Module 02 teradata basicsMd. Noor Alam
 
Data Warehouse Best Practices
Data Warehouse Best PracticesData Warehouse Best Practices
Data Warehouse Best PracticesEduardo Castro
 
SAP BOBJ Rapid Mart Overview & Implementation
SAP BOBJ Rapid Mart Overview & ImplementationSAP BOBJ Rapid Mart Overview & Implementation
SAP BOBJ Rapid Mart Overview & ImplementationRamakrishna Kamurthy
 
Delta machenism with db connect
Delta machenism with db connectDelta machenism with db connect
Delta machenism with db connectObaid shaikh
 
sap hana|sap hana database| Introduction to sap hana
sap hana|sap hana database| Introduction to sap hanasap hana|sap hana database| Introduction to sap hana
sap hana|sap hana database| Introduction to sap hanaJames L. Lee
 

What's hot (20)

An Introduction To BI
An Introduction To BIAn Introduction To BI
An Introduction To BI
 
Introduction To Msbi By Yasir
Introduction To Msbi By YasirIntroduction To Msbi By Yasir
Introduction To Msbi By Yasir
 
Star schema my sql
Star schema   my sqlStar schema   my sql
Star schema my sql
 
SAS/Tableau integration
SAS/Tableau integrationSAS/Tableau integration
SAS/Tableau integration
 
Data Warehousing and Bitmap Indexes - More than just some bits
Data Warehousing and Bitmap Indexes  - More than just some bitsData Warehousing and Bitmap Indexes  - More than just some bits
Data Warehousing and Bitmap Indexes - More than just some bits
 
Star schema
Star schemaStar schema
Star schema
 
Architecting a Data Warehouse: A Case Study
Architecting a Data Warehouse: A Case StudyArchitecting a Data Warehouse: A Case Study
Architecting a Data Warehouse: A Case Study
 
Oracle: DW Design
Oracle: DW DesignOracle: DW Design
Oracle: DW Design
 
SAS Programming For Beginners | SAS Programming Tutorial | SAS Tutorial | SAS...
SAS Programming For Beginners | SAS Programming Tutorial | SAS Tutorial | SAS...SAS Programming For Beginners | SAS Programming Tutorial | SAS Tutorial | SAS...
SAS Programming For Beginners | SAS Programming Tutorial | SAS Tutorial | SAS...
 
business analysis-Data warehousing
business analysis-Data warehousingbusiness analysis-Data warehousing
business analysis-Data warehousing
 
Module 02 teradata basics
Module 02 teradata basicsModule 02 teradata basics
Module 02 teradata basics
 
HANA Modeling
HANA Modeling HANA Modeling
HANA Modeling
 
Data Warehouse Best Practices
Data Warehouse Best PracticesData Warehouse Best Practices
Data Warehouse Best Practices
 
Sap Business Objects solutioning Framework architecture
Sap Business Objects solutioning Framework architectureSap Business Objects solutioning Framework architecture
Sap Business Objects solutioning Framework architecture
 
SAP BOBJ Rapid Mart Overview & Implementation
SAP BOBJ Rapid Mart Overview & ImplementationSAP BOBJ Rapid Mart Overview & Implementation
SAP BOBJ Rapid Mart Overview & Implementation
 
SAP BOBJ Rapid Marts Overview I
SAP BOBJ Rapid Marts Overview ISAP BOBJ Rapid Marts Overview I
SAP BOBJ Rapid Marts Overview I
 
Dwh faqs
Dwh faqsDwh faqs
Dwh faqs
 
Aspects of data mart
Aspects of data martAspects of data mart
Aspects of data mart
 
Delta machenism with db connect
Delta machenism with db connectDelta machenism with db connect
Delta machenism with db connect
 
sap hana|sap hana database| Introduction to sap hana
sap hana|sap hana database| Introduction to sap hanasap hana|sap hana database| Introduction to sap hana
sap hana|sap hana database| Introduction to sap hana
 

Viewers also liked

Metadaten und Data Vault (Meta Vault)
Metadaten und Data Vault (Meta Vault)Metadaten und Data Vault (Meta Vault)
Metadaten und Data Vault (Meta Vault)Andreas Buckenhofer
 
CDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
CDC und Data Vault für den Aufbau eines DWH in der AutomobilindustrieCDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
CDC und Data Vault für den Aufbau eines DWH in der AutomobilindustrieAndreas Buckenhofer
 
Data vault seminar May 5-6 Dommel - The factory and the workshop
Data vault seminar May 5-6 Dommel - The factory and the workshopData vault seminar May 5-6 Dommel - The factory and the workshop
Data vault seminar May 5-6 Dommel - The factory and the workshopjohannesvdb
 
Data Vault ReConnect Speed Presenting PM Part Four
Data Vault ReConnect Speed Presenting PM Part FourData Vault ReConnect Speed Presenting PM Part Four
Data Vault ReConnect Speed Presenting PM Part FourHans Hultgren
 
Lean Data Warehouse via Data Vault
Lean Data Warehouse via Data VaultLean Data Warehouse via Data Vault
Lean Data Warehouse via Data VaultDaniel Upton
 
Data Warehouse Agility Array Conference2011
Data Warehouse Agility Array Conference2011Data Warehouse Agility Array Conference2011
Data Warehouse Agility Array Conference2011Hans Hultgren
 
Data Vault ReConnect Speed Presenting AM Part One
Data Vault ReConnect Speed Presenting AM Part OneData Vault ReConnect Speed Presenting AM Part One
Data Vault ReConnect Speed Presenting AM Part OneHans Hultgren
 
Real-life Customer Cases using Data Vault and Data Warehouse Automation
Real-life Customer Cases using Data Vault and Data Warehouse AutomationReal-life Customer Cases using Data Vault and Data Warehouse Automation
Real-life Customer Cases using Data Vault and Data Warehouse AutomationPatrick Van Renterghem
 
Business Intelligence Overview
Business Intelligence OverviewBusiness Intelligence Overview
Business Intelligence Overviewnetpeachteam
 
Guru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best PracticesGuru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best PracticesCGI
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouseStephen Alex
 
A First Look at San Francisco’s New ETL Job Platform
A First Look at San Francisco’s New ETL Job PlatformA First Look at San Francisco’s New ETL Job Platform
A First Look at San Francisco’s New ETL Job PlatformSafe Software
 
Caching: In-Memory Column Store oder im BI Server
Caching: In-Memory Column Store oder im BI ServerCaching: In-Memory Column Store oder im BI Server
Caching: In-Memory Column Store oder im BI ServerAndreas Buckenhofer
 
Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)Kent Graziano
 
Data warehouse design
Data warehouse designData warehouse design
Data warehouse designines beltaief
 
Supporting Data Services Marketplace using Data Virtualization
Supporting Data Services Marketplace using Data VirtualizationSupporting Data Services Marketplace using Data Virtualization
Supporting Data Services Marketplace using Data VirtualizationDenodo
 
Agile BI via Data Vault and Modelstorming
Agile BI via Data Vault and ModelstormingAgile BI via Data Vault and Modelstorming
Agile BI via Data Vault and ModelstormingDaniel Upton
 
Introduction to ETL process
Introduction to ETL process Introduction to ETL process
Introduction to ETL process Omid Vahdaty
 

Part 4 - Data Warehousing Lecture at BW Cooperative State University (DHBW)

  • 1. LECTURE @DHBW: DATA WAREHOUSE. PART IV: FRONTEND, METADATA, PROJECTS, ADVANCED TOPICS
      Andreas Buckenhofer, Daimler TSS (a company of Daimler AG)
  • 3. DAIMLER TSS. IT EXCELLENCE: COMPREHENSIVE, INNOVATIVE, CLOSE.
      We're a specialist and strategic business partner for innovative IT solutions within Daimler, not just another supplier! As a 100% subsidiary of Daimler, we live the culture of excellence and aspire to take an innovative and technological lead.
      With our outstanding technological and methodical competence we are a competent provider of services that help those who benefit from them to stand out from the competition. When it comes to demanding IT questions we create impetus, especially in the core fields car IT and mobility, information security, analytics, shared services and digital customer experience.
      TSS 2020: ALWAYS ON THE MOVE.
  • 4. LOCATIONS
      • Daimler TSS Germany: 6 locations, more than 1000 employees
        Ulm (headquarters); Stuttgart area (Böblingen, Echterdingen, Leinfelden, Möhringen); Berlin
      • Daimler TSS China: Hub Beijing, 6 employees
      • Daimler TSS Malaysia: Hub Kuala Lumpur, 38 employees
      • Daimler TSS India: Hub Bangalore, 16 employees
  • 5. WHAT YOU WILL LEARN TODAY
      After the end of this lecture you will be able to
      • Understand the function of frontend tools and information design
      • Understand the necessity for metadata
      • Understand the lifecycle of DWH projects
      • Understand advanced topics like Operational BI, DWH appliances, Cloud BI
  • 7. LOGICAL STANDARD DATA WAREHOUSE ARCHITECTURE
      [Figure: architecture diagram. Data sources: internal and external (OLTP). Data warehouse layers: Staging Layer (Input Layer), Integration Layer (Cleansing Layer), Core Warehouse Layer (Storage Layer), Aggregation Layer, Mart Layer (Output Layer, Reporting Layer). Backend and Frontend. Also shown: Metadata Management, Security, DWH Manager incl. Monitor.]
  • 8. INTERFACE TO THE END USER
      • Reporting (standard, ad-hoc)
      • OLAP
      • Dashboards, scorecards
      • Advanced analytics / data mining / text mining
      • Search & discovery
  • 9. REPORTING (STANDARD, AD-HOC)
      Standard reports
      • Prepared static reports that end users can execute on request
      • Are executed at the end of an ETL process and, e.g., sent by email to end users
      • Normally based on fact tables and their dimensions
      • Often lists similar to Excel sheets, but can also contain graphics (e.g. line charts)
      Ad-hoc reports
      • End users create their own reports ("self-service")
  • 10. OLAP
      ROLAP / MOLAP client frontend
      • Prepared cubes (multidimensional or relational fact tables)
      • User can perform interactive analysis of data
        • Roll-up / drill-down
        • Pivot
        • Slicing
        • Dicing
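The OLAP operations named on this slide can be illustrated on a relational fact table (the ROLAP case). The following is only a minimal sketch using Python's built-in sqlite3 module; the `sales` table, its columns, and the sample rows are hypothetical and not taken from the lecture.

```python
import sqlite3

# Hypothetical fact table with three dimension columns and one measure.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (year INTEGER, region TEXT, product TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (2016, "North", "Car", 100),
        (2016, "North", "Van", 40),
        (2016, "South", "Car", 70),
        (2017, "North", "Car", 120),
        (2017, "South", "Van", 30),
    ],
)


def query(sql, params=()):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql, params).fetchall()


# Drill-down: amount per year and region (finer grain).
drill_down = query(
    "SELECT year, region, SUM(amount) FROM sales "
    "GROUP BY year, region ORDER BY year, region"
)

# Roll-up: aggregate the region level away, leaving amount per year.
roll_up = query("SELECT year, SUM(amount) FROM sales GROUP BY year ORDER BY year")

# Slice: fix one dimension to a single value (year = 2016).
slice_2016 = query(
    "SELECT region, product, amount FROM sales "
    "WHERE year = ? ORDER BY region, product",
    (2016,),
)

# Dice: restrict several dimensions at once, giving a sub-cube.
dice = query(
    "SELECT year, region, product, amount FROM sales "
    "WHERE region = ? AND product = ? ORDER BY year",
    ("North", "Car"),
)

# Pivot: turn the region values into columns, one row per year.
pivot = query(
    "SELECT year, "
    "SUM(CASE WHEN region = 'North' THEN amount ELSE 0 END) AS north, "
    "SUM(CASE WHEN region = 'South' THEN amount ELSE 0 END) AS south "
    "FROM sales GROUP BY year ORDER BY year"
)
```

An OLAP frontend typically generates comparable queries (or MDX against a multidimensional cube) when the user clicks through the data; the SQL here only shows what each operation means.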
  • 11. DASHBOARDS, SCORECARDS
      "Progress reports"
      • Provide an overall view of KPIs (Key Performance Indicators)
      • Combine several elements from reporting and/or OLAP (e.g. line charts) into an overall view (like a "cockpit")
      A dashboard is more focused on operational goals
      • High-level overview of what is happening
      A scorecard is more focused on strategic goals
      • Plan a strategy and identify why something happens
  • 12. ADVANCED ANALYTICS / DATA MINING / TEXT MINING
      See Mr. Bollinger's lecture
  • 13. SEARCH & DISCOVERY
      Not just numerical data: analysis of new data types is becoming more and more important
      • Text
      • GPS coordinates
      • Pictures
      • Videos
      Data can be available in RDBMS (e.g. text modules/indexes available), Hadoop or SQL DBs
  • 14. MANY GRAPHICAL ELEMENTS TO USE IN REPORTS
      Source: https://github.com/d3/d3/wiki/Gallery
  • 15. MANY GRAPHICAL ELEMENTS … CHAMBER OF HORROR
      Source: Hichert / Faisst, http://www.backup-page.hichert.com/
  • 16. MANY GRAPHICAL ELEMENTS … CHAMBER OF HORROR
      Some remarks about the previous slide
      • 3D elements introduce clutter and add no information
      • The pie chart most often does not make sense
      • The line chart is barely readable
      • Labels are placed outside of the graphic
      • The tachometer takes up a lot of space and show …
      • Too much color in general
      • Color without meaning, e.g. red should be used for alarms / errors
  • 17. STORY TELLING WITH APPROPRIATE VISUALIZATION
      Famous example by Hans Rosling (watch 3:08 onwards)
      https://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen?language=de
  • 18. INFORMATION DESIGN
      Information design is the practice of presenting information in a way that fosters efficient and effective understanding of it. (Source: Wikipedia, https://en.wikipedia.org/wiki/Information_design)
      Some authors are well known for their criticism of many graphical representations; they provide rules for good information design
      • Edward Tufte
      • Stephen Few
      • Rolf Hichert
  • 19. INFORMATION DESIGN
      Define standards, e.g.
      • Always use the same colors, and use them with care (red = negative, green = positive)
      • Pie charts are rarely useful and should be avoided (better use a bar chart or line chart)
      • No 3D elements, as they don't enhance information but introduce clutter
  • 20. TABLE WITH INTEGRATED BAR CHARTS
      Source: Hichert, http://www.hichert.com/de/resource/table-template-02/
  • 21. BI END USER ROLES
      Consumers / BI users
      • Use reports, OLAP and dashboards to obtain information
      Power users
      • Use reports, OLAP and dashboards to obtain information
      • Create new reports and dashboards
      Data scientists
      • Statistical / mathematical geeks
      • Analyze / explore data
      • Need to analyze raw (non-cleansed, non-transformed) data
  • 23. WHAT IS METADATA?
      Data about other data
  • 24. TYPES OF METADATA
      Business metadata
      • Definition of business vocabulary and relationships
      • Definition of the value range
      • Linkage to physical representation
  • 25. TYPES OF METADATA
      Report metadata
      • Report definitions
      • Data sources
      • Column definitions
      • Computations
      Logical and physical metadata of the data model
      • Table structure
      • Definition of columns
      • Relationships between tables and columns
      • Dimension hierarchy
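Physical metadata of a data model (table structure, column definitions, relationships) is usually stored in the database catalog and can be queried like any other data. As a small illustration, the sketch below reads it from SQLite via the `PRAGMA table_info` and `PRAGMA foreign_key_list` statements; the two tables are hypothetical, and other database systems expose the same information through their own catalog views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE fact_sales (
        sale_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES dim_customer (customer_id),
        amount REAL
    );
    """
)

# Table structure and column definitions.
# Each row is (cid, name, type, notnull, dflt_value, pk).
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(fact_sales)")
]

# Relationships between tables.
# Each row is (id, seq, table, from, to, on_update, on_delete, match).
relationships = [
    (row[3], row[2], row[4])
    for row in conn.execute("PRAGMA foreign_key_list(fact_sales)")
]
```

Here `columns` holds name and type per column, and `relationships` holds (referencing column, referenced table, referenced column) triples, i.e. data about the data model rather than business data.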
  • 26. THE AREAS OF METADATA
  • 27. THE AREAS OF METADATA CONNECTED
  • 28. WHY A COMMON METADATA REPOSITORY?
      Components of a data warehouse system are interconnected
      • BI report user has to know
        • the meaning and definitions of the shown measures, "KPIs" (key performance indicators)
      • BI report designer has to know
        • the table definitions
        • the meaning of the column values
      • ETL job designer has to know
        • the table definitions or the exact definition of the measures
      • Database administrator has to know
        • which tables are used by ETL jobs and reports
  • 29. WHY A COMMON METADATA REPOSITORY?
      Metadata-driven ETL development
      • Generate parts of the ETL code
      • Increasing interest for Data Vault development projects
      • Tools, e.g. MID Innovator, Quipu, AnalytiX DS, Talend, Pentaho, Wherescape, and others
      A common metadata repository ensures consistency across all components
      • Many tools involved (DB, ETL, frontend, …)
      Enables cross-component metadata analysis
      • Data lineage
      • Impact analysis
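To make "generate parts of the ETL code" concrete, here is a minimal sketch of metadata-driven generation: a source-to-target mapping is held as data, and the load statement is rendered from it. The mapping layout, table names, and the `generate_load_statement` helper are assumptions made for illustration; the tools listed above use their own metadata models.

```python
# Hypothetical mapping metadata: one entry per target column.
MAPPING = {
    "target_table": "core_customer",
    "source_table": "stg_customer",
    "columns": [
        {"target": "customer_id", "expression": "cust_no"},
        {"target": "customer_name", "expression": "TRIM(name)"},
        {"target": "country_code", "expression": "UPPER(country)"},
    ],
}


def generate_load_statement(mapping):
    """Render an INSERT ... SELECT statement from the mapping metadata."""
    targets = ", ".join(col["target"] for col in mapping["columns"])
    expressions = ", ".join(col["expression"] for col in mapping["columns"])
    return (
        f"INSERT INTO {mapping['target_table']} ({targets}) "
        f"SELECT {expressions} FROM {mapping['source_table']}"
    )


sql = generate_load_statement(MAPPING)
print(sql)
```

Because the mapping is the single source of truth, adding a column means changing one metadata entry instead of editing hand-written SQL in several places.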
  • 30. WHY A COMMON METADATA REPOSITORY?
      "Data lineage"
      • Import and browse full BI report metadata
      • Navigate through report attributes
      • Visually navigate through data lineage across tools
      • Combines operational and design viewpoint
  • 31. WHAT HAPPENS IF I CHANGE THIS COLUMN?
      "Impact analysis"
      • Show complete change impact in graphical or list form
      • Includes impact on reports in BI tools
      • Visually navigate through impacted objects across tools
      • Allows impact analysis on any object type
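Data lineage and impact analysis can both be seen as traversals of the same dependency graph stored in the metadata repository: impact analysis follows the edges downstream from a changed object, lineage follows them upstream from a report. The sketch below assumes a tiny, hypothetical dependency graph; the object names are invented for illustration.

```python
from collections import deque

# Hypothetical metadata: each entry reads "object -> objects that consume it".
DEPENDENCIES = {
    "stg_orders.amount": ["etl_load_core_orders"],
    "etl_load_core_orders": ["core_orders.amount"],
    "core_orders.amount": ["etl_build_sales_mart"],
    "etl_build_sales_mart": ["mart_sales.revenue"],
    "mart_sales.revenue": ["report_monthly_revenue", "dashboard_kpi"],
}


def reachable(graph, start):
    """Breadth-first search; returns every object reachable from start."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def invert(graph):
    """Reverse all edges so that consumers point back to their sources."""
    inverted = {}
    for source, consumers in graph.items():
        for consumer in consumers:
            inverted.setdefault(consumer, []).append(source)
    return inverted


# Impact analysis: what is affected if this column changes?
impact = reachable(DEPENDENCIES, "core_orders.amount")

# Data lineage: where does the data in this report come from?
lineage = reachable(invert(DEPENDENCIES), "report_monthly_revenue")
```

The traversal only works across tools (database, ETL, frontend) if all of them write their dependencies into one common repository, which is the point the slides make.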
  • 32. WHAT DOES THIS FIELD MEAN?
      • Show relationships between business terms, data model entities, and technical and report fields
      • Requires cross-tool mapping of business terms
      • Allows field meaning to be understood
      • Allows business term relationships to be understood
  • 33. WHAT OBJECTS DOES THIS USER OWN?
      • Shows objects that the user manages
      • Shows stewardship relationships on business terms
      • Shows user group associations
  • 34. WHAT HAPPENED ON THE LAST JOB RUN?
      • Navigation through complete job details
      • Navigation of complete operational metadata
  • 36. LOGICAL STANDARD DATA WAREHOUSE ARCHITECTURE
      [Figure: the architecture diagram from slide 7, annotated with the two design directions Top Down (Inmon) and Bottom Up (Kimball).]
  • 37. TOP-DOWN VS BOTTOM-UP APPROACH
      Top-Down (Inmon)
      • Comprehensive approach regarding available data
      • Design the Core Warehouse Layer (= integrated data model) first, considering all requirements
      • Design data marts afterwards
      Bottom-Up (Kimball)
      • Approach focusing on fast delivery of first results
      • Design one data mart first
      • Further marts are modeled afterwards, usually using the Kimball architecture
      • Conformed dimensions integrate different data marts / fact tables
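A conformed dimension is a dimension table shared by several fact tables (or marts), so that their measures can be combined on the same keys. The following sketch, again using sqlite3, shows such a "drill-across" query; the tables `dim_product`, `fact_sales`, and `fact_claims` are hypothetical examples, not taken from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One conformed dimension shared by two fact tables.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE fact_sales  (product_id INTEGER, units_sold INTEGER);
    CREATE TABLE fact_claims (product_id INTEGER, claims INTEGER);
    INSERT INTO dim_product VALUES (1, 'Car'), (2, 'Van');
    INSERT INTO fact_sales  VALUES (1, 10), (1, 5), (2, 4);
    INSERT INTO fact_claims VALUES (1, 3), (2, 1), (2, 1);
    """
)

# Aggregate each fact table to the grain of the shared dimension first,
# then join the results; joining the raw fact rows would multiply them.
rows = conn.execute(
    """
    SELECT d.product_name, s.units_sold, c.claims
    FROM dim_product d
    JOIN (SELECT product_id, SUM(units_sold) AS units_sold
          FROM fact_sales GROUP BY product_id) s ON s.product_id = d.product_id
    JOIN (SELECT product_id, SUM(claims) AS claims
          FROM fact_claims GROUP BY product_id) c ON c.product_id = d.product_id
    ORDER BY d.product_name
    """
).fetchall()
```

If each mart kept its own, differently keyed product table, this query would need a mapping between the two, which is the inconsistency risk the bottom-up approach has to manage.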
  • 38. TOP-DOWN VS BOTTOM-UP APPROACH: ADVANTAGES AND DISADVANTAGES
      Top-Down (Inmon)
      • Advantage: Core Warehouse Layer is designed optimally
      • Advantage: data from the Core Warehouse Layer is reused in many marts
      • Disadvantage: time-consuming approach with high preparatory effort
      • Disadvantage: high risk with changing requirements
      Bottom-Up (Kimball)
      • Advantage: early involvement of end users
      • Advantage: fast results
      • Disadvantage: focus on single marts leads to the risk that the overall view is lost, esp. a properly designed Core Warehouse Layer
      • Disadvantage: data is often not reused but inconsistently copied across marts
  • 39. THINK BIG, START LOCAL
      Both approaches have their downsides
      • Top-Down takes enormous initial effort to build the data model for the Core Warehouse Layer
      • Bottom-Up is risky as the central / integrated focus is lost
      Think big, start small
      • Think big: design a conceptual data model for the Core Warehouse Layer covering the whole enterprise
      • Start small: implement the physical data model for Core and Mart Layer in iterations, business department by business department
  • 40. PROCESS MODEL
      • Classical
        • Waterfall model
        • Incremental model
      • Agile
        • Scrum
        • Kanban
  • 41. WHAT'S DIFFERENT IN DWH PROJECTS?
      • DWH is not a system or product
      • DWH databases are more complex, with different layers and data models
      • Data first, code is secondary
      • Data quality is a major concern
      • Data integration is a challenging objective
      • Business need is difficult to justify quantitatively
  • 42. WHY DO DWH PROJECTS FAIL?
  • 43. PROJECT PHASES
      • Feasibility study
      • Analysis
      • Design
      • Implementation
      • Test
      • Operations and maintenance
  • 44. FEASIBILITY STUDY
      • Benefits of DWH
      • Cost-effectiveness
      • SW selection
      • HW selection
      • Staff requirements, including external know-how
      • Data protection and data security agreement, data classification
      • Proof of Concept (PoC) to challenge different possible solutions
      • Architectural concept
  • 45. ANALYSIS
      • Documentation of user requirements: specification sheet
        • Backend including ETL
        • Frontend
        • Security
        • Metadata, business glossary
        • Non-functional requirements
      • Analysis of data sources
        • Data quality
        • Data models
        • Data security
  • 46. DESIGN
      • Technical description of how to implement the specifications
      • Data model for the different DWH layers
      • Data integration design
      • Frontend design
      • Security concept
      • Capacity planning
  • 47. IMPLEMENTATION
      • Installation of development, test, integration, production, and maintenance environments
      • Usage of a metadata repository for the implementation of data model, ETL, frontend, security
      • Launch of DWH
      • Release management
  • 48. TEST
      • Functional
      • Data quality / data validation
      • Usability
      • Performance
      • Operational
      • Security
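As an illustration of the "data quality / data validation" test type, the following sketch reconciles a loaded target against its source with a few typical checks (row count, NULL keys, key uniqueness, sum of a measure). The `validate` function and the sample rows are hypothetical; real projects usually run such checks as SQL against the warehouse.

```python
def validate(source_rows, target_rows, key, measure):
    """Return a list of findings; an empty list means all checks passed."""
    findings = []
    if len(source_rows) != len(target_rows):
        findings.append(
            f"row count differs: {len(source_rows)} vs {len(target_rows)}"
        )
    if any(row[key] is None for row in target_rows):
        findings.append(f"{key} contains NULL values")
    keys = [row[key] for row in target_rows]
    if len(keys) != len(set(keys)):
        findings.append(f"{key} is not unique")
    source_sum = sum(row[measure] for row in source_rows)
    target_sum = sum(row[measure] for row in target_rows)
    if source_sum != target_sum:
        findings.append(f"sum of {measure} differs")
    return findings


source = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
good_target = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
bad_target = [{"id": 1, "amount": 10}, {"id": 1, "amount": 25}]

good_findings = validate(source, good_target, "id", "amount")
bad_findings = validate(source, bad_target, "id", "amount")
```

Checks like these are cheap to automate and can run after every load, not only during the test phase.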
  • 49. OPERATIONS AND MAINTENANCE
      • Deployment of new features, changes or bug fixes
      • End user training
      • Monitoring
      • Production concept
      • Initial load and future delta loads
      • Keep the system running
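The difference between an initial load and later delta loads can be sketched with a simple "watermark" on a change timestamp: the initial load copies everything, each delta load copies only rows changed since the previous run. This is one common technique and not necessarily the one used in the lecture; the `load` function, the `changed_at` column, and the sample rows are assumptions for illustration.

```python
from datetime import datetime


def load(source_rows, target, watermark=None):
    """Copy rows changed after the watermark into target.

    Returns the new watermark (the latest change timestamp seen so far).
    """
    new_watermark = watermark
    for row in source_rows:
        if watermark is None or row["changed_at"] > watermark:
            target[row["id"]] = row  # insert or overwrite by business key
            if new_watermark is None or row["changed_at"] > new_watermark:
                new_watermark = row["changed_at"]
    return new_watermark


source = [
    {"id": 1, "name": "A", "changed_at": datetime(2017, 1, 1)},
    {"id": 2, "name": "B", "changed_at": datetime(2017, 1, 2)},
]
target = {}

# Initial load: no watermark yet, so every row is copied.
watermark = load(source, target)

# The source changes: row 1 is updated and row 3 is new.
source[0] = {"id": 1, "name": "A2", "changed_at": datetime(2017, 1, 3)}
source.append({"id": 3, "name": "C", "changed_at": datetime(2017, 1, 4)})

# Delta load: only rows changed after the watermark are processed.
watermark = load(source, target, watermark)
```

Note that this sketch does not detect deleted source rows; capturing deletes needs an additional mechanism such as change data capture or a full key comparison.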
  • 50. BICC: BI CENTER OF EXCELLENCE
      Organizational team that coordinates and standardizes DWH activities within an (end user) organization
      • Define standards and create the BI portfolio (e.g. which tools/products to use)
      • Create the DWH architecture and govern BI activities
      • Establish processes for business and IT interaction
      • Monitor the DWH/BI market for new trends
      • Determine skills and experience of business users
  • 51. EXERCISE • Define 3-5 criteria for the evaluation of an ETL tool • How does a relational DBMS (like Oracle, DB2, MS SQL Server) meet these requirements?
  • 52. EXERCISE - DEFINE 5 CRITERIA FOR THE EVALUATION OF AN ETL TOOL • Supplier profile • Support • HW/SW requirements • License / maintenance costs • Usability • Reliability • Performance and scalability • Multi-tenant • Interfaces • Scheduling
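One common way to turn such criteria into a comparable result is a weighted scoring matrix. The following is a minimal sketch only: the selection of four criteria, the weights, the candidate names `tool_a` and `tool_b`, and all scores are hypothetical values chosen for illustration, not figures from the lecture.

```python
# Hypothetical weights for four of the criteria listed above; they sum to 1.0.
CRITERIA_WEIGHTS = {
    "usability": 0.25,
    "performance_scalability": 0.30,
    "license_maintenance_cost": 0.25,
    "interfaces": 0.20,
}


def weighted_score(scores, weights=CRITERIA_WEIGHTS):
    """Return the weighted sum of per-criterion scores (1 = poor, 5 = excellent)."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical candidates with made-up scores.
candidates = {
    "tool_a": {"usability": 4, "performance_scalability": 3,
               "license_maintenance_cost": 2, "interfaces": 5},
    "tool_b": {"usability": 3, "performance_scalability": 4,
               "license_maintenance_cost": 4, "interfaces": 3},
}

# Rank candidates from best to worst total score.
ranking = sorted(candidates,
                 key=lambda name: weighted_score(candidates[name]),
                 reverse=True)
```

The weights make the trade-off explicit: a tool that is strong on interfaces can still rank lower if cost and performance carry more weight.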
  • 53. EXERCISE - HOW DOES A RELATIONAL DBMS MEET THESE REQUIREMENTS? • RDBMS provide many of the functionalities, but additional programming is required • RDBMS are often used for ETL/ELT by programming with SQL, PL/SQL, SQLT, etc.
    ETL Tool | Manual ETL
    Informatica, Talend, Oracle ODI, etc. | SQL, PL/SQL, SQLT, etc.
    Separate license | No additional license
    Workflow, error handling, and restart/recovery functionality included | Workflow, error handling, and restart/recovery functionality must be implemented manually
    Impact analysis and where-used (lineage) functionality available | Impact analysis and where-used (lineage) functionality difficult
    Faster development, easier maintenance | Slower development, more difficult maintenance
    Additional (tool) know-how required | Know-how often available
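To make the "must be implemented manually" point concrete, here is a minimal sketch of what hand-written workflow, error handling, and restart/recovery logic might look like. The function `run_workflow`, the step names, and the in-memory checkpoint set are assumptions for illustration; a real manual ETL would typically persist the checkpoint, for example in a control table.

```python
def run_workflow(steps, checkpoint):
    """Run steps in order, skipping those already recorded as done.

    `steps` is a list of (name, callable) pairs. `checkpoint` is a mutable set
    of completed step names (hypothetical stand-in for a persisted control table).
    """
    for name, action in steps:
        if name in checkpoint:
            continue  # restart/recovery: do not redo completed work
        try:
            action()
        except Exception as exc:
            # Error handling: stop here, because later steps depend on this one.
            return f"failed at {name}: {exc}"
        checkpoint.add(name)
    return "ok"


# Demo: the transform step fails on its first attempt and succeeds on the second.
log = []
attempts = {"transform": 0}


def extract():
    log.append("extract")


def transform():
    attempts["transform"] += 1
    if attempts["transform"] == 1:
        raise RuntimeError("source value out of range")
    log.append("transform")


def load():
    log.append("load")


steps = [("extract", extract), ("transform", transform), ("load", load)]
done = set()
first_run = run_workflow(steps, done)   # stops at the failing transform step
second_run = run_workflow(steps, done)  # resumes at transform, skips extract
```

Even this small sketch has to decide how to record progress and where to resume, which is the kind of functionality an ETL tool ships with.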
  • 54. NEWER / ADVANCED TOPICS • OPERATIONAL DATA WAREHOUSING • DATA WAREHOUSE APPLIANCES • CLOUD BI
  • 55. OPERATIONAL DATA WAREHOUSING • "Classical" data warehouses: • Information in the warehouse is used to support strategic business decisions • Kept separate from operational systems • New data is loaded only at larger intervals (mostly weekly or monthly) • Shorter intervals are not required by users • The ETL process consumes huge system resources • Near real-time operational data warehousing: • Information in the warehouse is used for tactical business decisions as well • Low latency of information in the data warehouse is therefore needed • Not only mathematical aggregations
  • 56. WHY OPERATIONAL DATA WAREHOUSING? • With classical data warehouses, users have to access two types of systems to get a complete picture of a customer (for instance for CRM applications or in call centers): • the data warehouse to see what happened in the past • the OLTP systems to get the most current information • With an operational data warehouse: • all this information is in one system • tighter integration with operational systems is easier • for instance personalized offers ("closing the loop")
  • 57. SMARTFACTORY: OPERATIONAL BI SERVICE PLATFORM • Workers getting alarms on their watch • Containing and displaying complex manuals, e.g. during repair • New data sources sending lots of data at high speed (sensor data, logs, etc.) • Right-time data required for automated actions, e.g. cordless screwdriver knows and adjusts torque • Source: Gluchowski: Analytische Informationssysteme, 5. Aufl., p. 279
  • 58. CHALLENGES FOR OPERATIONAL DATA WAREHOUSING • Near real-time / right-time ETL: • Executed asynchronously; triggered by business transactions in the OLTP • Incremental real-time load • Tighter integration of operational and data warehouse systems • DWHs become "mission critical": • Higher requirements on availability and performance • Higher "transactional" system load on the data warehouse system: • DWH DB has to deal with typical DWH system load and transactional load • Data quality mandatory: • Data is used for automated decisions
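The incremental load mentioned above can be sketched with a high-water mark: each run copies only source rows that are newer than what the warehouse already holds. This is a simplified, pull-based variant rather than the transaction-triggered load described on the slide. The tables `orders` and `fact_orders` and the column `change_seq` are hypothetical, and two in-memory SQLite databases stand in for the OLTP system and the DWH.

```python
import sqlite3


def incremental_load(src, dwh):
    """Copy only source rows changed since the last load (high-water mark)."""
    (watermark,) = dwh.execute(
        "SELECT COALESCE(MAX(change_seq), 0) FROM fact_orders"
    ).fetchone()
    rows = src.execute(
        "SELECT order_id, amount, change_seq FROM orders "
        "WHERE change_seq > ? ORDER BY change_seq",
        (watermark,),
    ).fetchall()
    dwh.executemany(
        "INSERT INTO fact_orders (order_id, amount, change_seq) VALUES (?, ?, ?)",
        rows,
    )
    dwh.commit()
    return len(rows)


# Hypothetical OLTP source and warehouse target, both in memory.
src = sqlite3.connect(":memory:")
dwh = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (order_id INTEGER, amount REAL, change_seq INTEGER)")
dwh.execute("CREATE TABLE fact_orders (order_id INTEGER, amount REAL, change_seq INTEGER)")

src.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(1, 10.0, 1), (2, 25.5, 2)])
src.commit()
first = incremental_load(src, dwh)   # initial load: both existing rows

src.execute("INSERT INTO orders VALUES (3, 7.0, 3)")
src.commit()
second = incremental_load(src, dwh)  # delta load: only the new row
third = incremental_load(src, dwh)   # nothing changed, nothing copied
```

Running the function more often shortens latency, but it also adds the "transactional" load on both systems that the slide lists as a challenge.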
  • 59. COMPARISON CLASSICAL DWH – OPERATIONAL DWH
    Classical DWH | Operational DWH
    Strategic: passive, historical trends | Tactical: prediction, automatic execution of decisions
    Batch: e.g. daily batch | Near real-time / right-time: up-to-date view
    Lower availability: system can be down for maintenance, and longer response times for some reports are accepted | Availability: system becomes critical and must fulfill high availability and performance requirements
  • 60. DATA WAREHOUSE APPLIANCES • Setting up and configuring a data warehouse system is a complex task: • Hardware • Servers + storage + network • Connectivity to source systems • Software • Database management system • ETL software • Reporting and analytics software • ... • Optimal performance of the whole system is difficult to achieve
  • 61. DATA WAREHOUSE APPLIANCES • Data warehouse appliances are: • Pre-configured and pre-tested hardware and software configurations developed for running a data warehouse • Optimized for data warehousing workloads / only suited for running OLAP • In contrast, "one size fits all": RDBMS are suited for OLTP, OLAP and mixed workloads • Ready to be used after they are delivered to the customer • Products, e.g. Teradata, HP Vertica, Exasol, Oracle Exadata, IBM Netezza (IBM PureData System for Analytics), MS Analytic Platform System
  • 62. APPLIANCE SIMPLICITY (E.G. ORACLE EXADATA)
  • 63. TYPICAL ENHANCEMENTS • Move as many operations as possible to the storage cell instead of moving data to the DB server • E.g. filter data already at the storage cell and not at the DB server • Avoid transferring unnecessary data • Column-oriented in-memory storage with high compression • Many appliances are based on a shared-nothing architecture • Each node is independent • Each node has its own storage or memory • Parallel processing is simpler and faster as there is no overhead due to contention
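The effect of filtering at the storage cell can be sketched by counting how many rows cross the network with and without pushing the predicate down. This is a toy model under stated assumptions: each list in `partitions` stands for the local data of one independent node in a shared-nothing layout, the function names are hypothetical, and the sample rows are invented.

```python
def scan_node(partition, predicate):
    """Runs on a storage node: filter locally and return only matching rows."""
    return [row for row in partition if predicate(row)]


def query_with_pushdown(partitions, predicate):
    """DB server sends the predicate to each node and merges the small results."""
    transferred = 0
    result = []
    for partition in partitions:  # a real appliance would run these scans in parallel
        matches = scan_node(partition, predicate)
        transferred += len(matches)  # only matching rows cross the network
        result.extend(matches)
    return result, transferred


def query_without_pushdown(partitions, predicate):
    """DB server pulls every row from every node and filters centrally."""
    transferred = 0
    result = []
    for partition in partitions:
        transferred += len(partition)  # every row crosses the network
        result.extend(row for row in partition if predicate(row))
    return result, transferred


# Invented sample data: three nodes, each owning its own partition.
partitions = [
    [{"region": "EU", "sales": 120}, {"region": "US", "sales": 80}],
    [{"region": "EU", "sales": 45}, {"region": "APAC", "sales": 60}],
    [{"region": "US", "sales": 200}, {"region": "EU", "sales": 15}],
]


def is_eu(row):
    return row["region"] == "EU"


pushed_rows, moved_with_pushdown = query_with_pushdown(partitions, is_eu)
plain_rows, moved_without_pushdown = query_without_pushdown(partitions, is_eu)
```

Both variants return the same rows; the difference is only in how much data is moved, which is the saving the slide describes.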
  • 64. CLOUD BI • BI applications (database, ETL tools, frontend) are hosted in a public cloud, e.g. • AWS (Amazon Web Services) • Microsoft Azure • … • Many tools nowadays are available in the cloud first • Vendors try to force customers to use clouds • Or even available in the cloud only • E.g. Microsoft Power BI
  • 65. CLOUD BI • Security concerns for sensitive data • But new data sources come from the Internet, so storing the data in a (public) cloud can make sense, e.g. • Connected cars, IoT in general
  • 66. CLOUD BI ARCHITECTURE • Source: Lang: Business Intelligence erfolgreich umsetzen, 5. Aufl., p. 185
  • 67. CLOUD BI ARCHITECTURE • Analytics as a service • Provide complete BI (analytics) SW stack including • data storage • data integration (ETL) • data visualization and/or data modeling (frontend) • metadata management • Data as a service • Provide quality data for further usage • Data marketplace
  • 68. CLOUD BI – DATA WAREHOUSING SERVICES • Source: http://db-engines.com/en/system/Amazon+Redshift%3BSnowflake
  • 69. SNOWFLAKE ARCHITECTURE • Don't confuse the Snowflake product with the snowflake dimensional model from session 2
  • 70. SNOWFLAKE ARCHITECTURE • Snowflake storage: • Snowflake loads data into its internal optimized, compressed, columnar format • Snowflake itself uses (!) Amazon Web Services' S3 (Simple Storage Service) cloud storage • Query processing: • Each virtual warehouse is an MPP (massively parallel processing) compute cluster composed of multiple compute nodes allocated by Snowflake from Amazon EC2 • Each virtual warehouse is an independent compute cluster that does not share compute resources with other virtual warehouses
  • 71. SNOWFLAKE ARCHITECTURE • Cloud services: • Authentication and access control • Infrastructure management • Metadata management • Query parsing and optimization • Security
  • 72. END OF LECTURE: GOOD TEXT BOOKS • Kimball et al: The Data Warehouse Lifecycle Toolkit, Wiley, 2008 • Bauer / Günzel: Data-Warehouse-Systeme, dpunkt, 2013 • Köppen et al: Data Warehouse Technologien, mitp, 2016 • Source: http://dilbert.com/strip/2014-05-07
  • 73. THANK YOU • Daimler TSS GmbH, Wilhelm-Runge-Straße 11, 89081 Ulm / Phone +49 731 505-06 / Fax +49 731 505-65 99 / tss@daimler.com / Internet: www.daimler-tss.com / Intranet portal code: @TSS • Domicile and Court of Registry: Ulm / HRB-Nr.: 3844 / Management: Christoph Röger (CEO), Steffen Bäuerle
  • 74. POSSIBLE DWH ANALYSIS AND DESIGN WORK PRODUCTS • How to document / identify requirements? • Must be easy for non-technical users to understand during the Analysis/Technical concept phase • Must provide sufficient information for the System Design phase • The following slides provide some example work products that are produced during the Analysis/Technical concept phase and may be refined during the System Design phase
  • 75. CRITICAL SUCCESS FACTORS FOR BUILDING A DATA WAREHOUSE • Answer the most important questions of participating business units • Provide high-quality data • Introduction in time • Usage of modern technology • Business orientation • Easy to use • Executive sponsor • Patience: user acceptance evolves over time
  • 76. EXAMPLES OF OPERATIONAL DATA WAREHOUSING • New applications and data sources increase demand for an operational DWH, e.g. • Industry 4.0 / Smart Factory • Internet of Things • Internet of Medical Things • Connected cars • Replace pen & paper with electronic workflows • Decision support for each end user and not only management • Increasing demand to publish the same content on different devices • Source: Gluchowski: Analytische Informationssysteme, 5. Aufl., p. 277

Editor's Notes

  1. Mission: We are a specialist and strategic business partner for innovative end-to-end IT solutions within the Daimler Group – not just another supplier! More than another supplier!
  2. Small iterations. A waterfall approach taking 8-12 months or longer often fails or does not deliver in time. Always think about how to achieve flexible data integration in the Core Warehouse Layer. Data Marts can be dropped and reloaded from data in the Core Warehouse Layer. Dropping the Core Warehouse Layer is not possible: it would mean data loss (history).
  3. Extreme Scoping (TDWI; Larissa Moss). Scrum: time-boxing. The core is a fixed time period of at most four weeks, called a sprint. Kanban: the goal is to optimize the work items in progress (WIP). The project participants measure the time a task takes until it is "done", identify bottlenecks, steer the project by limiting WIP, and adapt it toward the most optimized "flow" possible.
  4. Cost-effectiveness
  5. Torque (German: Drehmoment)
  6. Data quality automation: e.g. Bosch workshops
  7. Platform as a Service (PaaS) refers to a service that provides a computing platform in the cloud for developers of web applications. This can include both quickly deployable runtime environments (typically for web applications) and development environments that can be used with little administrative effort and without purchasing the underlying hardware and software. The SaaS model is based on the principle that the software and the IT infrastructure are operated by an external IT service provider and used by the customer as a service. Business Process as a Service (BPaaS) is the outsourcing of business processes to a cloud. Business Process Outsourcing (BPO) is intended to reduce the costs of development, automation, and process control, since in cloud computing these are incurred as fixed monthly costs.