Part 3(4)
The slides contain a DWH lecture given to 5th-semester students. Content:
- Introduction DWH and Business Intelligence
- DWH architecture
- DWH project phases
- Logical DWH Data Model
- Multidimensional data modeling
- Data import strategies / data integration / ETL
- Frontend: Reporting and analysis, information design
- OLAP
3. DAIMLER TSS. IT EXCELLENCE: COMPREHENSIVE, INNOVATIVE, CLOSE.
We are a specialist and strategic business partner for innovative IT solutions within Daimler – not just another supplier!
As a 100% subsidiary of Daimler, we live a culture of excellence and aspire to take an innovative and technological lead.
With our outstanding technological and methodological competence, we provide services that help our customers stand out from the competition. When it comes to demanding IT questions, we create impetus, especially in the core fields of car IT and mobility, information security, analytics, shared services, and digital customer experience.
TSS 2020. ALWAYS ON THE MOVE.
4. LOCATIONS
Daimler TSS China: Hub Beijing, 6 employees
Daimler TSS Malaysia: Hub Kuala Lumpur, 38 employees
Daimler TSS India: Hub Bangalore, 16 employees
Daimler TSS Germany: 6 locations, more than 1000 employees – Ulm (headquarters), Stuttgart area (Böblingen, Echterdingen, Leinfelden, Möhringen), Berlin
5. WHAT YOU WILL LEARN TODAY
After the end of this lecture you will be able to:
• Understand the concepts behind ETL
6. LOGICAL STANDARD DATA WAREHOUSE ARCHITECTURE
[Architecture diagram: internal and external data sources (OLTP) feed the backend of the data warehouse – Staging Layer (Input Layer), Integration Layer (Cleansing Layer), Core Warehouse Layer (Storage Layer), Aggregation Layer, and Mart Layer (Output/Reporting Layer) – towards the frontend. Metadata management, security, and the DWH manager incl. monitor span all layers. Question marks between the layers ask which processes move the data.]
7. ETL PROCESS
Extract – Transform – Load
Other term: data integration (better, more neutral)
8. TASKS OF THE ETL PROCESS – EXTRACT
• Capture and copy data from source systems (e.g. operational systems)
• Many different types of sources:
• Relational and hierarchical DBMSs
• Flat files
• Other internal/external sources
9. TASKS OF THE ETL PROCESS – TRANSFORM
• Filter data
• Integrate data
• Check and cleanse data
10. TASKS OF THE ETL PROCESS – LOAD
• Original meaning: fast load into the staging area
• General meaning: loading data into the staging area or another layer
11. ETL VS ELT
ETL is often used as a term for data integration in general (covering both ETL and ELT). But when ELT is mentioned explicitly, it is being differentiated from ETL.
[Diagram: with ETL, data flows from the source DB through the ETL server into the target DB; with ELT, data flows directly from the source DB into the target DB, and the ELT server only controls the processing.]
12. ETL VS ELT
• Data movement – ETL: data is transferred to the ETL server and transferred back to the DB, so high network bandwidth is required; ELT: data remains in the DB except for cross-database loads (e.g. source to target).
• Transformations – ETL: performed in the ETL server; ELT: performed in the source or in the target.
• Code – ETL: proprietary code is executed in the ETL server; ELT: generated code, e.g. SQL, PL/SQL, SQLT.
• Typical use – ETL: source-to-target transfer, compute-intensive transformations, small amounts of data; ELT: high amounts of data.
13. ETL/ELT TOOL VS MANUAL ETL/ELT
• Examples – tool: Informatica, Talend, Oracle ODI, etc.; manual: SQL, PL/SQL, SQLT, etc.
• License – tool: separate license; manual: no additional license.
• Workflow, error handling, and restart/recovery – tool: functionality included; manual: must be implemented by hand.
• Impact analysis and where-used (lineage) – tool: functionality available; manual: difficult.
• Effort – tool: faster development, easier maintenance; manual: slower development, more difficult maintenance.
• Know-how – tool: additional (tool) know-how required; manual: know-how often available.
14. ETL/ELT TOOL VS MANUAL ETL/ELT
[Diagram of typical ETL tool services: extract services (connectors, sorter); load services (connector, sorter, bulk loader); data profiling services (source analysis); data quality services (data cleansing); data transformation and integration services (data mapping, business rules, slowly changing dimensions, datatype conversion, lookups); and operations management services (scheduler, control, repository management, job monitoring, auditing, error handling, security).]
19. MONITORING (DATA CHANGE DETECTION)
Extracts from source systems:
• Initial extract for setting up the data warehouse (initial load)
• Periodical extracts for adding new/changed information to the data warehouse (incremental load)
Question: how can we determine what is new or what has changed in the source systems? This is the task of "monitoring".
20. MONITORING: NET EFFECT OF CHANGES
Discovery of all changes vs. determining only the net effect at extract/load time.
Example: an attribute value can be changed in two ways – by one update operation, or by one delete and one insert operation. The net effect of both is the same; however, history information is lost if only the net effect is recorded.
21. EXERCISE: MONITORING
Which techniques can be used to identify changes in a source system (RDBMS)? E.g. in an OLTP system:
• New products are inserted
• A customer address changes
• A product is deleted because it is out of stock
How would you identify such changes? List the advantages and disadvantages of possible solutions. Think about making changes in the source system, but also about solutions that require no change in the source system.
22. MONITORING TECHNIQUES
The available techniques depend on the characteristics of the data sources. The following techniques are based on modern relational DBMSs.
Based on the DBMS:
• Trigger-based
• Log-based discovery
• Replication techniques
Controlled by the application:
• Timestamp-based discovery
• Snapshot-based discovery
23. TRIGGER-BASED
Active monitoring mechanisms based on (database) triggers.
• Example: if a new record is inserted into the sales transaction table, insert the transaction id and a timestamp into a change table.
Advantage:
• Triggers do not change the operational applications
Disadvantage:
• Performance impact on the operational systems if triggers are used extensively
• Triggers have to be implemented for every table in the source systems
24. TRIGGER-BASED
Sample trigger syntax, Oracle:
CREATE [OR REPLACE] TRIGGER <trigger_name>
{BEFORE|AFTER} {INSERT|DELETE|UPDATE}
ON <table_name>
[REFERENCING [NEW AS <new_row_name>] [OLD AS <old_row_name>]]
[FOR EACH ROW [WHEN (<trigger_condition>)]]
<trigger_body>
A trigger is created for each source table in the OLTP DB and stores insert/update/delete changes in a "log/journal table":
• the trigger body contains insert statements into the log/journal table
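To make the skeleton concrete, here is a minimal sketch of such a journaling trigger; the sales source table, the sales_journal table, and their columns are illustrative assumptions, not taken from the slides:

-- hypothetical example: journal every change to a 'sales' table
CREATE OR REPLACE TRIGGER trg_sales_cdc
AFTER INSERT OR UPDATE OR DELETE ON sales
FOR EACH ROW
BEGIN
  IF INSERTING THEN
    INSERT INTO sales_journal (transaction_id, change_type, change_ts)
    VALUES (:NEW.transaction_id, 'I', SYSTIMESTAMP);
  ELSIF UPDATING THEN
    INSERT INTO sales_journal (transaction_id, change_type, change_ts)
    VALUES (:NEW.transaction_id, 'U', SYSTIMESTAMP);
  ELSIF DELETING THEN
    INSERT INTO sales_journal (transaction_id, change_type, change_ts)
    VALUES (:OLD.transaction_id, 'D', SYSTIMESTAMP);
  END IF;
END;
/

The ETL delta load then only reads sales_journal instead of scanning the full sales table.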
25. LOG-BASED
Log-based discovery, also often referred to as CDC (Change Data Capture).
Uses database transaction logs to determine changes:
• DBMSs write transaction logs in order to be able to undo partially executed transactions
• This information can be used to determine all changes
• A log reader identifies inserts, updates, deletes, and truncates, and writes the changes as inserts into the staging layer
Transaction log files can be transferred to other systems to avoid additional load on the source systems.
26. LOG-BASED (SAMPLE PRODUCT ARCHITECTURE: IIDR)
[Diagram: an IIDR replication engine reads the transaction logs of the source OLTP DB; a second IIDR replication engine writes the captured changes into the staging layer of the DWH DB, from where the core and mart layers and the frontend reports (standard and ad-hoc) are fed.]
27. REPLICATION-BASED
Replication techniques (data replication):
• Target tables are not necessarily on the local system
• Typically uses transaction logs
• A log reader identifies inserts, updates, deletes, and truncates, and writes the changes into replicated tables (an insert remains an insert, an update remains an update, etc.)
• Useful for 1:1 copies (e.g. an ODS, Operational Data Store), but it is still a challenge to detect changes for loading the data mart
28. REPLICATION-BASED (SAMPLE PRODUCT ARCHITECTURE: IIDR)
[Diagram: as in the log-based setup, an IIDR replication engine reads the source transaction logs, but here the changes are applied to replicated tables in the DWH DB (staging, core, and mart layers) that feed the frontend reports.]
29. TIMESTAMP-BASED
Timestamp-based discovery:
• Every data item in a table is associated with timestamp information about its validity period
• Changed data can be determined from this timestamp information
30. TIMESTAMP-BASED
Sample customer table in OLTP:
• Each table gets a change timestamp
• The delta process reads only the latest data (e.g. ChangeTimestamp >= <yesterday>)
• Problem: it is not possible to identify deleted rows

CustomerID | Name | Department | Change Timestamp
1 | Miller | DWH | 15.01.2015 17:00:01
2 | Powell | DB | 22.03.2016 08:30:22
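A minimal sketch of the delta read, assuming a customer table with a ChangeTimestamp column and a bind variable holding the last extraction time:

-- hypothetical delta extract: read only rows changed since the last run
SELECT CustomerID, Name, Department, ChangeTimestamp
FROM customer
WHERE ChangeTimestamp >= :last_extract_ts;

Deleted rows never show up in this result, which is exactly the limitation noted above.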
31. SNAPSHOT-BASED
Data comparison: comparison of snapshots of the operational data at different points in time.
• Compute the difference between the two latest snapshots
• E.g. unload all data from a table into a file and diff the newest file content against the previous one
Can be very complex. Sometimes it is the only possibility, for instance for legacy applications. High performance impact on the source.
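If both snapshots are available as tables, the comparison can be sketched with set operations (Oracle MINUS syntax; the snapshot tables are assumptions):

-- new or changed rows since the previous snapshot
SELECT * FROM customer_snapshot_new
MINUS
SELECT * FROM customer_snapshot_old;

-- deleted rows: keys that existed before but are missing now
SELECT CustomerID FROM customer_snapshot_old
MINUS
SELECT CustomerID FROM customer_snapshot_new;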
32. MONITORING TECHNIQUES COMPARISON
• Performance impact on source system – trigger-based: medium; replication: low; log-based: low; timestamp-based: medium; snapshot-based: high.
• Performance impact on target system – trigger-based: low; replication: low; log-based: low; timestamp-based: low; snapshot-based: high.
• Load on network – trigger-based: low; replication: low; log-based: low; timestamp-based: low; snapshot-based: high.
• Data loss with nologging operations – trigger-based: no; replication: yes; log-based: yes; timestamp-based: no; snapshot-based: no.
33. MONITORING TECHNIQUES COMPARISON
• Identify DELETE operations – trigger-based: yes; replication: yes; log-based: yes; timestamp-based: no; snapshot-based: yes.
• Identify ALL changes (changes between extractions) – trigger-based: yes; replication: yes; log-based: yes; timestamp-based: no; snapshot-based: no.
34. DATA TRANSPORT – DIRECT ACCESS
Direct access:
• The source writes data into the target, or
• The target reads data from the source
• Security concerns
• High coupling / dependencies
[Diagram: Source connected directly to Target.]
35. DATA TRANSPORT – FILE TRANSFER
File transfer (or another transport medium):
• csv, json, xml, binary, etc.
• Transfer data by scp, rfts (reliable file transfer system), an ESB (enterprise service bus), SOA (service-oriented architecture), etc.
• Often high amounts of data; therefore bulk transfer of compressed data is most widely used
• Better decoupling of source and target
[Diagram: the Source writes files, which are transferred to the Target.]
36. EXTRACTION INTERVALS
• Periodically – in regular intervals (every day, week, etc.)
• Instantly / continuously – every change is directly propagated into the data warehouse ("real-time data warehouse")
The right interval depends on the requirements on the timeliness of the data warehouse data.
37. EXTRACTION INTERVALS
Triggered by a specific request:
• Addition of a new product
• A query which involves more recent data
Triggered by specific events:
• The number of changes in the operational data exceeds a threshold
38. PREREQUISITE OF ETL – UNDERSTANDING THE DATA
• Profile existing data sources and the extracted data
• Analyze data structure, content, and quality
• Find data relationships across systems (foreign keys are often badly documented or missing)
• Uncover data issues that can affect subsequent transformation steps:
• Missing values
• Duplicates
• Inconsistencies
39. DATA QUALITY ISSUES
CustomerNo | Name | Birthdate | Age | Gender | Zip code
1 | Miller, Tom | 33.01.2001 | 15 | M | NULL
1 | John Mayor | 15.01.2001 | 15 | M | 98144
2 | Mrs. Bush | 31.10.1988 | 22 | Q | 00000
3 | Martin | 31.10.1988 | 22 | M | 75890
Issues illustrated: PK/unique key violated (CustomerNo 1 appears twice); data not uniform ("Miller, Tom" vs "John Mayor"); not valid (birthdate 33.01.2001); inconsistent (birthdate vs. age); wrong value (gender Q); unknown/missing (NULL zip code); FK violated (zip code 00000).
40. DATA QUALITY ISSUES AND POSSIBLE SOLUTIONS IN THE SOURCE RDBMS
Issue | Solution
Wrong data, e.g. 31.02.2016 | Proper data type definition
Wrong values, e.g. number out of range | Check constraint
Missing values | NOT NULL constraint
Violated references | FOREIGN KEY constraint
Duplicates | PRIMARY or UNIQUE KEY constraint
Inconsistent data | ACID transactions, business logic, additional checks
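A sketch of how these solutions look as DDL (generic SQL; the customer table and the zip_codes reference table are assumptions):

CREATE TABLE customer (
  customer_no INTEGER PRIMARY KEY,                        -- prevents duplicates
  name        VARCHAR(100) NOT NULL,                      -- prevents missing values
  birthdate   DATE,                                       -- proper data type rejects 33.01.2001
  age         INTEGER CHECK (age BETWEEN 0 AND 150),      -- rejects out-of-range values
  zip_code    VARCHAR(10) REFERENCES zip_codes(zip_code)  -- rejects violated references
);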
42. DATA QUALITY ISSUES: WORKAROUNDS IN THE DWH
Correcting the data:
• Automatically during ETL
• E.g. the address of a customer, if a correct reference table exists
• Manually after ETL is finished
• ETL stores bad data in error log tables or files
• ETL flags bad data (e.g. as invalid)
43. DATA QUALITY ISSUES: CORRECTING DATA IN THE SOURCE
Correcting the data in the source systems:
• Common master data management across all operational applications
• Dedicated systems are the "master" of e.g. customer data
• Correcting the data at the source is the best approach, but slow and often not feasible
44. DATA QUALITY ISSUES: MISSING DATA
• Column is null
• Reject the data
• Use default values
• Missing values can represent
• an unknown value, like the date of birth of a customer
• a missing value, like the engine_id for a car (logical NOT NULL constraint)
• Dimension tables can include dummy values:
DimensionTable_X | Description
-1 | Unknown
-2 | Missing
45. DATA QUALITY ISSUES: INVALID DATA
• Data is inaccurate, e.g. a wrong date 32.12.2015 or a wrong number 55U
• Reject the data
• Replace it with a value that represents "Invalid"
• Dimension tables can include dummy values:
DimensionTable_X | Description
-1 | Unknown
-2 | Missing
-3 | Invalid
46. DATA QUALITY ISSUES: CONFLICTING DATA
• Data has conflicts, e.g. the wrong postal code 80995 for Stuttgart
• Reject the data
• Replace one of the values with a value that represents "Invalid", or with a corrected value
• Which value to replace? Rules are necessary
47. DATA QUALITY ISSUES: INCONSISTENT DATA
• Data is inconsistent, e.g. an order date after the payment date, or an unlikely high price for a product
• Can be discovered by statistical and data mining methods
48. DATA QUALITY ISSUES: DUPLICATES
• Data is duplicated, e.g. "Martin Miller" vs "Miller, Martin" vs "M. Miller"
• Multiple representations of one entity: different keys, different encodings
• Duplicate detection can be very difficult/tricky
• Products are available for e.g. address duplicate detection, address validation (Kingstreet – does this address actually exist?), and address harmonization (Kingstr, Kingstreet, King Street, etc.)
• Standardize/harmonize data during the ETL flow: "unification" for better duplicate detection
49. TRANSFORM – UNIFICATION OF DATA
• Unification of data types
• Character string date "20.01.2006" → date 20.01.2006
• Character string number "12345" → number 12345
• Unification of encodings
• For instance F and M for gender
• Lookup tables contain the mapping from old to new encodings
• Combination of different attributes into one attribute
• day, month, year → date
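A minimal sketch of such unification in SQL (Oracle-style conversion functions; the staging table and the gender lookup table are assumptions):

SELECT TO_DATE(o.order_date_str, 'DD.MM.YYYY') AS order_date,  -- string to date
       TO_NUMBER(o.amount_str)                 AS amount,      -- string to number
       g.gender_std                            AS gender,      -- encoding mapped via lookup table
       TO_DATE(o.birth_day || '.' || o.birth_month || '.' || o.birth_year,
               'DD.MM.YYYY')                   AS birth_date   -- combine day/month/year into a date
FROM staging.orders o
JOIN lookup_gender g ON g.gender_src = o.gender_raw;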
50. TRANSFORM – UNIFICATION OF DATA
• Split of one attribute into two or more
• Name → first name, last name ("Herr Prof. Dr. Hans M. vom und zum Stein")
• Unification of names can become very challenging: "Herr Prof. Dr. Hans M. vom und zum Stein", "Werner Martin", or "Mariae Gloria … Wilhelmine Huberta Gräfin von Schönburg-Glauchau"
• Product name "Cola, 0.33 l" → product short name "Cola", size in liters 0.33
51. TRANSFORM – UNIFICATION OF DATA
• Unification of dates and timestamps
• Rules for representing incomplete date information (e.g. if only month and year are known)
• Dates and timestamps with regard to one specific timezone
• Important for multi-national organizations
• UTC (Coordinated Universal Time) has no daylight saving time
• What can happen when the clock is changed to winter time if UTC is not used?
- An update arrives at 02:15 in the staging layer (CDC / log-based monitor)
- The clock is changed to winter time: -1h
- An update of the same row arrives at 02:10 in the staging layer (CDC / log-based)
- How can the batch load running the next night discover which update is the most recent one?
52. TRANSFORM – UNIFICATION OF DATA
• Computation of derived values
• Profit = sales price – purchase price
Without a clear definition, different interpretations are possible: net or gross sales price? Net or gross purchase price?
• Aggregations
• Revenue of the year computed from revenues of the day
Without a clear definition, different interpretations are possible: calendar year? Fiscal year?
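Pinning the definition down in code removes the ambiguity; a minimal sketch, assuming a sales mart table with explicit net price columns and a calendar-year definition:

-- profit per calendar year; 'net' prices and 'calendar year' are the assumed definitions
SELECT EXTRACT(YEAR FROM order_date)             AS calendar_year,
       SUM(net_sales_price - net_purchase_price) AS profit
FROM mart.sales
GROUP BY EXTRACT(YEAR FROM order_date);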
53. DATA MAPPING
Specification between source and target columns:
• Source tables + columns
• Target table + columns
• Join rules
• Filter criteria
• Transformation rules
54. LOAD
• Efficient load operations are important
• Bulk load: single-row processing vs. set-based processing
• Online load
• The data warehouse (especially the data mart) remains accessible
• Offline load
• The data warehouse (especially the data mart) is offline
• For updates that require the recomputation of a cube
• Offline load is often a tool limit because the tool locks data structures; but an offline load can be faster
55. BULK PROCESSING
• Specific bulk load operations are provided by the RDBMS, e.g. external tables in Oracle or the LOAD command in DB2
• Single-row vs. set-based processing:

Single-row processing:
Cursor curs = SELECT * FROM <source>;
WHILE NOT EOF(curs) LOOP
  FETCH NEXT ROW INTO myRow;
  INSERT INTO <target> VALUES (myRow);
END LOOP;

Set-based processing:
INSERT INTO <target>
SELECT * FROM <source>;

• Error handling – single-row: easy; set-based: all or nothing if there are errors
• Performance – single-row: slow for high amounts of data; set-based: performs well for small and high amounts of data
• Code – single-row: more coding; set-based: less code = fewer errors
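As an illustration of a bulk load path, a minimal sketch of an Oracle external table feeding a set-based insert (directory, file, and column names are assumptions):

-- external table over a CSV file (hypothetical names)
CREATE TABLE staging.customer_ext (
  customer_id NUMBER,
  name        VARCHAR2(100)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY etl_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ';'
  )
  LOCATION ('customer.csv')
);

-- set-based load into the staging layer
INSERT INTO staging.customer SELECT * FROM staging.customer_ext;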
56. ETL JOB PARALLELISM FOR LOADING DATA INTO THE CORE WAREHOUSE LAYER
[Diagram: integrating new jobs into load time windows, e.g. 00:00-06:00. In a Data Vault load, HUBs are loaded first, then LINKs and HUB SATs, then LINK SATs.]
• Classical load: complex, many dependencies, many sequential jobs
• Data Vault load: systematic/methodic, few well-defined dependencies, massively parallel
57. EXERCISE: LOAD DATA VAULT TABLES
Draw a flow diagram showing how to load a HUB, LINK, and SAT table, and describe the SQL statements.
58. EXERCISE: LOAD HUB TABLE
[Flow diagram: if source data exists, load the distinct business keys; for each key, check whether the business key already exists in the HUB. If not, insert the row into the HUB (conflict if a PK hash-key collision occurs!); otherwise reject the data. End state: data loaded into the HUB.]
59. EXERCISE: LOAD HUB TABLE
INSERT INTO core.hub_fahrzeug (vehicle_hk, fin, loaddate, recordsource)
SELECT DISTINCT f.fahrzeug_hashkey
     , f.fin_bk
     , f.loaddate
     , f.recordsource
FROM staging.fahrzeugdaten f
WHERE f.fin_bk NOT IN (SELECT fin FROM core.hub_fahrzeug)
  AND f.loaddate = <date to load>;
60. EXERCISE: LOAD LINK TABLE
[Flow diagram: if source data exists, load the distinct business keys; for each combination, check whether the hash-key relationship already exists in the LINK. If not, insert the row into the LINK (conflict if a PK hash-key collision occurs!); otherwise reject the data. End state: data loaded into the LINK.]
61. EXERCISE: LOAD LINK TABLE
INSERT INTO core.link_verbaut (verbaut_hk, motor_hk, vehicle_hk, loaddate, recordsource)
SELECT DISTINCT f.verbaut_hk
     , f.motor_hashkey
     , f.fahrzeug_hashkey
     , f.loaddate
     , f.recordsource
FROM staging.fahrzeugdaten f
WHERE (f.motor_hashkey, f.fahrzeug_hashkey) NOT IN
      (SELECT v.motor_hk, v.vehicle_hk FROM core.link_verbaut v)
  AND f.loaddate = <date to load>;
62. EXERCISE: LOAD SAT TABLE
[Flow diagram: if source data exists, load the distinct source data and the current/latest row from the SAT table; compare the MD5 hash values. If they are not identical, insert the row into the SAT; otherwise reject the data. End state: data loaded into the SAT.]
63. EXERCISE: LOAD SAT TABLE
INSERT INTO core.sat_fahrzeug_text (vehicle_hk, loaddate, recordsource, md5_hash, codeleiste, kommentar)
SELECT DISTINCT f.fahrzeug_hashkey
     , f.loaddate
     , f.recordsource
     , f.md5hash
     , f.codeleiste
     , f.kommentar
FROM staging.fahrzeugdaten f
LEFT OUTER JOIN (
    SELECT s.vehicle_hk, s.md5_hash
    FROM core.sat_fahrzeug_text s
    JOIN (
        SELECT i.vehicle_hk, MAX(i.loaddate) AS loaddate
        FROM core.sat_fahrzeug_text i
        GROUP BY i.vehicle_hk
    ) m ON s.vehicle_hk = m.vehicle_hk AND s.loaddate = m.loaddate
) k ON f.fahrzeug_hashkey = k.vehicle_hk
WHERE (k.md5_hash IS NULL OR f.md5hash <> k.md5_hash)
  AND f.loaddate = <date to load>;
64. LOGICAL STANDARD DATA WAREHOUSE ARCHITECTURE
[Recap of the architecture diagram from slide 6: internal and external sources (OLTP) feed the staging, integration, core warehouse, aggregation, and mart layers towards the frontend; metadata management, security, and the DWH manager incl. monitor span all layers.]
66. WHAT YOU WILL LEARN TODAY
After the end of this lecture you will be able to:
• Understand DB techniques that are specific to DWHs
• Bitemporal data
• Indexing, partitioning, parallelism
67. TEMPORAL DATA STORAGE (BITEMPORAL DATA)
[Timeline from 10.09. to 10.10.: the new price of 16 EUR is entered into the DB on 10.09. (transaction time), but is valid only from 20.09. (valid time); until then the price is 15 EUR.]
68. TEMPORAL DATA STORAGE (BITEMPORAL DATA)
Valid time is the time period during which a fact is true in the real world. Transaction time is the time period during which a fact stored in the database was known. Bitemporal data combines both valid and transaction time.
Source: Wikipedia, https://en.wikipedia.org/wiki/Temporal_database
69. TEMPORAL DATA STORAGE (BITEMPORAL DATA)
• Standardized in SQL:2011
• But implemented differently by RDBMSs like Oracle, DB2, SQL Server, and others
• Different syntax!
• Different coverage of the standard!
• Very useful for slowly changing dimensions type 2, but also for other purposes
70. DB2 VALID TIME EXAMPLE
CREATE TABLE customer_address
( customerID INTEGER NOT NULL
, name VARCHAR(100)
, city VARCHAR(100)
, valid_start DATE NOT NULL
, valid_end DATE NOT NULL
, PERIOD BUSINESS_TIME(valid_start, valid_end)
, PRIMARY KEY(customerID, BUSINESS_TIME WITHOUT OVERLAPS) );
71. DB2 VALID TIME EXAMPLE
INSERT INTO customer_address VALUES
(1, 'Miller', 'Seattle', '01.01.2013', '31.12.2013');

UPDATE customer_address FOR PORTION OF BUSINESS_TIME
FROM '22.05.2013' TO '31.12.2013'
SET city = 'San Diego' WHERE customerID = 1;

Resulting rows:
customerID | Name | City | Valid_start | Valid_end
1 | Miller | Seattle | 01.01.2013 | 22.05.2013
1 | Miller | San Diego | 22.05.2013 | 31.12.2013
73. DB2 TRANSACTION TIME EXAMPLE
CREATE TABLE customer_info(
customerId INTEGER NOT NULL,
comment VARCHAR(1000) NOT NULL,
sys_start TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW BEGIN,
sys_end TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW END,
PERIOD SYSTEM_TIME (sys_start, sys_end)
);
74. DB2 TRANSACTION TIME EXAMPLE
Transaction on 15.10.2013:
INSERT INTO customer_info VALUES (1, 'comment 1');
Transaction on 31.10.2013:
UPDATE customer_info SET comment = 'comment 2'
WHERE customerId = 1;

Current table content:
CustomerId | comment | Sys_start | Sys_end
1 | Comment 2 | 31.10.2013 | 31.12.2999
75. DB2 TRANSACTION TIME EXAMPLE
SELECT *
FROM customer_info FOR SYSTEM_TIME AS OF '17.10.2013';

The data comes from a history table:
CustomerId | comment | Sys_start | Sys_end
1 | Comment 1 | 15.10.2013 | 31.10.2013

Valid time and transaction time can be combined = bitemporal table.
76. INDEXING – WHY
• Very important performance improvement technique
• Good for many reads with high selectivity; writes pay a penalty
• B-trees are most common
[Diagram: B-tree index with a root node, branch nodes, and leaf nodes pointing into the table.]
77. INDEXING A STAR SCHEMA – WHICH COLUMNS ARE CANDIDATES FOR AN INDEX?
• DBs index primary keys by default
• Dimension table columns that are regularly used in WHERE clauses are candidates
• Maybe foreign key columns in the fact table (see the star transformation below)
78. STAR TRANSFORMATION
• The fact table normally has many more rows than the dimension tables
• Common join techniques would first join a dimension table with the fact table
• Alternative technique: evaluate all dimensions first (Cartesian join), then join into the fact table in the last step
• Oracle uses bitmap indexes on the foreign key columns of fact tables to achieve the star join; this is not supported by many DBs
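A minimal sketch of the prerequisite for the star transformation in Oracle, assuming a fact_sales table with the usual dimension foreign keys (all names hypothetical):

-- bitmap indexes on the fact table's foreign key columns
CREATE BITMAP INDEX fact_sales_time_bix ON fact_sales (time_id);
CREATE BITMAP INDEX fact_sales_prod_bix ON fact_sales (product_id);
CREATE BITMAP INDEX fact_sales_cust_bix ON fact_sales (customer_id);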
79. EXERCISE: PERFORMANCE
Suppose you have a fact table containing data for the last 10 years with millions of rows, but you are interested only in
• data from yesterday, or
• data from the last 2 years.
How could you improve performance?
80. EXERCISE: PERFORMANCE
Suppose you have a fact table containing data for the last 10 years with millions of rows, but you are interested only in parts of it.
• A columnar in-memory DB may be an option in general (this option has already been discussed during the lecture)
• Data from yesterday
• Indexing might be a good choice, as not many rows are read
• Data from the last 2 years
• Indexing is most likely a bad choice, as reading a rather high amount of data via an index quickly becomes inefficient
• Partitioning
81. PARTITIONING
[Diagram: vertical partitioning (sharding) splits a table by columns, e.g. col1/col2 into one part and col3/col4 into another; horizontal partitioning splits a table by rows, e.g. rows 1-2 into one part and row 3 into another.]
82. HORIZONTAL PARTITIONING
• Very powerful feature in a DWH to reduce workload
• Splits a table into logically smaller tables
• Avoids full table scans
• How could a table be split?
83. HORIZONTAL PARTITIONING – SPLITTING OPTIONS
• By range
• Most common
• Use a date field like the order date to partition the table into months, days, etc.
• By list
• Use a field that has a limited number of distinct values, e.g. split customer data by country if end users most likely select customers from within one country
• By hash
• Use a field that most likely splits the data into evenly distributed chunks
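A minimal sketch of range partitioning in Oracle syntax (table, columns, and partition bounds are assumptions):

CREATE TABLE fact_orders (
  order_id   NUMBER,
  order_date DATE,
  amount     NUMBER
)
PARTITION BY RANGE (order_date) (
  PARTITION p2015 VALUES LESS THAN (DATE '2016-01-01'),
  PARTITION p2016 VALUES LESS THAN (DATE '2017-01-01'),
  PARTITION pmax  VALUES LESS THAN (MAXVALUE)
);

-- a query on recent data only scans the matching partitions (partition pruning)
SELECT SUM(amount) FROM fact_orders WHERE order_date >= DATE '2016-01-01';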
84. PARALLELISM
• Statements are normally executed on one CPU
• Parallelism allows the DB to distribute the execution across several CPUs
• Powerful in combination with partitioning
• Parallelism is limited by the number of CPUs: if the degree of parallelism is too high, performance will degrade
• Intra-query parallelism and inter-query parallelism
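As an illustration, a minimal sketch of requesting intra-query parallelism in Oracle (the degree of 8 and the table name are assumptions):

-- ask the optimizer to scan the fact table with up to 8 parallel workers
SELECT /*+ PARALLEL(f, 8) */ COUNT(*)
FROM fact_orders f;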
85. ALREADY COVERED IN A PREVIOUS LECTURE
• Relational columnar in-memory DBs
• Materialized views / query tables
86. THANK YOU
Daimler TSS GmbH
Wilhelm-Runge-Straße 11, 89081 Ulm / Telefon +49 731 505-06 / Fax +49 731 505-65 99
tss@daimler.com / Internet: www.daimler-tss.com / Intranet-Portal-Code: @TSS
Domicile and Court of Registry: Ulm / HRB-Nr.: 3844 / Management: Christoph Röger (CEO), Steffen Bäuerle
Editor's Notes
Mission: We are a specialist and strategic business partner for innovative end-to-end IT solutions within the Daimler Group – not just another supplier! More than just another supplier!
Master data management (MDM) comprises all strategic, organizational, methodological, and technological activities relating to a company's master data.