SlideShare a Scribd company logo
1 of 8
Download to read offline
Prof. Hasso Plattner
A Course in
In-Memory Data Management
The Inner Mechanics
of In-Memory Databases
September 4, 2015
This learning material is part of the reading material for Prof.
Plattner’s online lecture "In-Memory Data Management" taking place at
www.openHPI.de. If you have any questions or remarks regarding the
online lecture or the reading material, please give us a note at openhpi-
imdb@hpi.uni-potsdam.de. We are glad to further improve the material.
Research Group "Enterprise Platform and Integration Concepts",
http://epic.hpi.uni-potsdam.de
Chapter 3
Enterprise Application Characteristics
3.1 Enterprise Data Sources
An enterprise data management system should be able to handle data com-
ing from several different data sources. In the ecosystem of modern enter-
prises, many applications work on and produce structured data. Enterprise
Resource Planning (ERP) systems, for example, typically create transactional
data to capture the operations of a business. More and more event and stream
data is created by modern manufacturing machines and sensors. At the same
time, large amounts of unstructured data is captured from the web, social
networks, log files, support systems, and others.
Business users need to query these different data sources as fast as possi-
ble to derive business value from the data or coordinate the operations of the
enterprise. Real-time analytics applications may work on any of the data
sources and combine them. They analyze the structured data of ERP systems
for real-time transactional reporting, classical analytics, planning, and sim-
ulation. The data from other data sources can be taken into account as well,
for example, a text analytics application can combine a customer sentiment
analysis on social network data with sales numbers or a production planning
application takes sensor data of the RFID sensors into account.
3.2 OLTP vs. OLAP
An enterprise data management system should be able to handle transac-
tional and analytical query types, which differ in several dimensions. Typi-
cal queries for Online Transaction Processing (OLTP) can be the creation of
sales orders, invoices, accounting data, the display of a sales order for a single
customer, or the display of customer master data. Online Analytical Pro-
cessing (OLAP) is used to summarize data, often for management reports.
17
18 3 Enterprise Application Characteristics
Typical analytical reports are for example the sales figures aggregated and
grouped by regions, different timeframes and products or the calculation of
Key Performance Indicators (KPIs).
Because it has always been considered that these query types are signifi-
cantly different, the data management systems were split into two separate
systems to tune the data storage and schemas accordingly. In the litera-
ture, it is claimed that OLTP workloads are write-intensive, whereas OLAP-
workloads are read-mostly and that the two workloads rely on “Opposing
Laws of Database Physics” [Fre95].
Yet, data and query analysis of current enterprise systems showed that
this statement is not true [KGZP10, KKG+11]. The main difference between
systems that handle these query types is that OLTP systems handle more
queries with a single select or queries that are highly selective returning
only a few tuples, whereas OLAP systems calculate aggregations for only a
few columns of a table, but for a large number of tuples.
For the synchronization of the analytical system with the transactional
system(s), a cost-intensive ETL (Extract-Transform-Load) process is required.
The ETL process introduces a delay and is relatively complex, because all
relevant changes have to be extracted from the outside source or sources if
there are several, data is transformed to fit analytical needs, and it is loaded
into the target database.
3.3 Drawbacks of the Separation of OLAP from OLTP
While the separation of the database into two systems allows for workload
specific optimizations in both systems, it also has a number of drawbacks:
• The OLAP system does not have the latest data, because the ETL process
introduces a delay. The delay can range from minutes to hours, or even
days. Consequently, many decisions have to rely on stale data instead of
using the latest information.
• To achieve acceptable performance, OLAP systems work with predefined,
materialized aggregates which reduce the query flexibility of the user.
• Data redundancy is high. Similar information is stored in both systems,
just differently optimized.
• The schemas of the OLTP and OLAP systems are different, which intro-
duces complexity for applications using both of them and for the ETL
process synchronizing data between the systems.
3.5 Combining OLTP and OLAP data 19
3.4 The OLTP vs. OLAP Access Pattern Myth
Transactional processing is often considered to have an equal share of read
and write operations, while analytical processing is dominated by large reads
and range queries. However, a workload analysis of multiple real customer
systems reveals that OLTP and OLAP systems are not as different as expected
in classic enterprise systems. As shown in [KKG+11], even OLTP systems
process over 80% of read queries, and only a fraction of the workload contains
write queries. Less than 10% of the actual workload are queries that modify
existing data, e.g. updates and deletes. OLAP systems process an even larger
amount of read queries, which make up about 95% of the workload.
The updates in the transactional workload are of particular interest. An
analysis of the updates in different industries is shown in Figure 3.1. It
confirms that the number of updates in OLTP systems is quite low [KKG+11],
and varies between industries. In the analyzed high-tech companies, the
update rate peaks at about 12 %, meaning that about 88 % of all tuples saved
in the transactional database are never updated. In other sectors, research
showed even lower update rates, e.g., less than 1 % in banking and discrete
manufacturing [KKG+11].
These results lead to the assumption that updates and deletes can be
implemented by inserting tuples with timestamps to record the period of
their validity.
The additional benefit of this so called insert-only approach is that the
complete transactional data history and a tuple’s life cycle are saved in the
database automatically, avoiding the need for log-keeping in the application
for auditing reasons. More details about the insert-only approach will be
provided in Chapter 26.
We can conclude that the characteristics of the two workload categories
are not that different after all, which leads to the vision of reuniting the two
systems and to combine OLTP and OLAP data in one system.
3.5 Combining OLTP and OLAP data
The main benefit of the combination is that both, transactional and analytical
queries can be executed on the same database using the same set of data as
a “single source of truth”. Thereby, the costly ETL process becomes obsolete
and all queries are performed against the latest version of the data.
In this book we show that with the use of modern hardware we can elim-
inate the need for pre-computed aggregates and materialized views. Data
aggregation can be executed on-demand and analytical views can be pro-
vided without delay. With the expected response time of analytical queries
below one second, it is possible to perform analytical query processing on the
transactional data directly, anytime and from any device. By dropping the
20 3 Enterprise Application Characteristics
Industry Percentage.of.Updated.Rows.in.Financial.Accounting
CPG 6% 6,2
High,Tech 12% 12,3
Logistics 4% 3,9
Discrete:M. 1% 0,9
Banking 1% 1
0%:
2%:
4%:
6%:
8%:
10%:
12%:
14%:
CPG: High,Tech: LogisCcs: Discrete:M.: Banking:
Percentage.of.Updated.Rows.
Fig. 3.1: Updates of a Financial Application in different Industries
pre-computation of aggregates and materialization of views, applications
and data structures can be simplified. The management of aggregates and
views (building, maintaining, and storing them) is not necessary any longer.
The resulting mixed workload combines the characteristics of OLAP and
OLTP workloads. A part of the queries in the workload performs typical
transactional request like the selection of a few, complete rows. Others aggre-
gate large amounts of the transactional data to generate real-time analytical
reports on the latest data. Especially applications that inherently use access
patterns from both workload groups and need access to the up-to-date data
benefit greatly from fast access to large amounts of transactional data, e.g.
dunning or planning applications.
More application examples are given in Chapter 35.
3.6 Enterprise Data is Sparse Data
By analyzing enterprise data in standard software, special data characteris-
tics were identified. Most interestingly, most tables are very wide and contain
hundreds of columns. However, many attributes of such table are not used
at all: 55 % of all columns are unused on average per company. This is due
to the fact, that standard software needs to support many workflows in dif-
ferent industries and countries, however a single company never uses all of
them. Further, in many columns NULL or default values are dominant, so
the entropy (information containment) of these columns is very low (near
zero).
But even the columns that are used by a specific company often have
a low cardinality of values, i.e., there are very few distinct values. Often
due to the fact that the data models the real world, and every company has
REFERENCES 21
only a limited number of products that can be sold, to a limited number of
customers, by a limited number of employees and so on.
These characteristics facilitate the efficient use of compression techniques
that we will introduce in Chapter 7, leading to lower memory consumption
and better query performance.
3.7 References
[Fre95] Clark D. French. “One Size Fits All” Database Architectures Do
Not Work for DSS. SIGMOD Rec., 24(2):449–450, May 1995.
[KGZP10] J. Krueger, M. Grund, A. Zeier, and H. Plattner. Enterprise
Application-specific Data Management. In EDOC, pages 131–
140, 2010.
[KKG+11] J. Krueger, C. Kim, M. Grund, N. Satish, D. Schwalb, J. Chhugani,
H. Plattner, P. Dubey, and A. Zeier. Fast Updates on Read-
Optimized Databases Using Multi-Core CPUs. PVLDB, 2011.
Enterprise application characteristics

More Related Content

What's hot

Data warehousing interview_questionsandanswers
Data warehousing interview_questionsandanswersData warehousing interview_questionsandanswers
Data warehousing interview_questionsandanswersSourav Singh
 
Best Practices Integration And Interoperability
Best  Practices    Integration And  InteroperabilityBest  Practices    Integration And  Interoperability
Best Practices Integration And InteroperabilityAllinConsulting
 
3 tier data warehouse
3 tier data warehouse3 tier data warehouse
3 tier data warehouseJ M
 
Appfluent - Transforming the Economics of Big Data
Appfluent - Transforming the Economics of Big DataAppfluent - Transforming the Economics of Big Data
Appfluent - Transforming the Economics of Big DataAppfluent Technology
 
Sap Interview Questions - Part 1
Sap Interview Questions - Part 1Sap Interview Questions - Part 1
Sap Interview Questions - Part 1ReKruiTIn.com
 
Etl - Extract Transform Load
Etl - Extract Transform LoadEtl - Extract Transform Load
Etl - Extract Transform LoadABDUL KHALIQ
 
Solving big data challenges for enterprise application
Solving big data challenges for enterprise applicationSolving big data challenges for enterprise application
Solving big data challenges for enterprise applicationTrieu Dao Minh
 
Hand Coding ETL Scenarios and Challenges
Hand Coding ETL Scenarios and ChallengesHand Coding ETL Scenarios and Challenges
Hand Coding ETL Scenarios and Challengesmark madsen
 
SAP HANA Integrated with Microstrategy
SAP HANA Integrated with MicrostrategySAP HANA Integrated with Microstrategy
SAP HANA Integrated with Microstrategysnehal parikh
 
Basics of Microsoft Business Intelligence and Data Integration Techniques
Basics of Microsoft Business Intelligence and Data Integration TechniquesBasics of Microsoft Business Intelligence and Data Integration Techniques
Basics of Microsoft Business Intelligence and Data Integration TechniquesValmik Potbhare
 
Introduction To Msbi By Yasir
Introduction To Msbi By YasirIntroduction To Msbi By Yasir
Introduction To Msbi By Yasiryasir873
 
OLAP Cubes in Datawarehousing
OLAP Cubes in DatawarehousingOLAP Cubes in Datawarehousing
OLAP Cubes in DatawarehousingPrithwis Mukerjee
 
Informatica
InformaticaInformatica
Informaticamukharji
 
MicroStrategy 9 - Extending Business Intelligence
MicroStrategy 9 - Extending Business IntelligenceMicroStrategy 9 - Extending Business Intelligence
MicroStrategy 9 - Extending Business IntelligenceMicroStrategy Nederland
 

What's hot (20)

Data warehousing interview_questionsandanswers
Data warehousing interview_questionsandanswersData warehousing interview_questionsandanswers
Data warehousing interview_questionsandanswers
 
Best Practices Integration And Interoperability
Best  Practices    Integration And  InteroperabilityBest  Practices    Integration And  Interoperability
Best Practices Integration And Interoperability
 
3 tier data warehouse
3 tier data warehouse3 tier data warehouse
3 tier data warehouse
 
Appfluent - Transforming the Economics of Big Data
Appfluent - Transforming the Economics of Big DataAppfluent - Transforming the Economics of Big Data
Appfluent - Transforming the Economics of Big Data
 
Data Warehouse 101
Data Warehouse 101Data Warehouse 101
Data Warehouse 101
 
Sap Interview Questions - Part 1
Sap Interview Questions - Part 1Sap Interview Questions - Part 1
Sap Interview Questions - Part 1
 
Data Warehouse
Data WarehouseData Warehouse
Data Warehouse
 
Data warehousing unit 1
Data warehousing unit 1Data warehousing unit 1
Data warehousing unit 1
 
Etl - Extract Transform Load
Etl - Extract Transform LoadEtl - Extract Transform Load
Etl - Extract Transform Load
 
Solving big data challenges for enterprise application
Solving big data challenges for enterprise applicationSolving big data challenges for enterprise application
Solving big data challenges for enterprise application
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Hand Coding ETL Scenarios and Challenges
Hand Coding ETL Scenarios and ChallengesHand Coding ETL Scenarios and Challenges
Hand Coding ETL Scenarios and Challenges
 
Etl techniques
Etl techniquesEtl techniques
Etl techniques
 
SAP HANA Integrated with Microstrategy
SAP HANA Integrated with MicrostrategySAP HANA Integrated with Microstrategy
SAP HANA Integrated with Microstrategy
 
Basics of Microsoft Business Intelligence and Data Integration Techniques
Basics of Microsoft Business Intelligence and Data Integration TechniquesBasics of Microsoft Business Intelligence and Data Integration Techniques
Basics of Microsoft Business Intelligence and Data Integration Techniques
 
Introduction To Msbi By Yasir
Introduction To Msbi By YasirIntroduction To Msbi By Yasir
Introduction To Msbi By Yasir
 
ETL QA
ETL QAETL QA
ETL QA
 
OLAP Cubes in Datawarehousing
OLAP Cubes in DatawarehousingOLAP Cubes in Datawarehousing
OLAP Cubes in Datawarehousing
 
Informatica
InformaticaInformatica
Informatica
 
MicroStrategy 9 - Extending Business Intelligence
MicroStrategy 9 - Extending Business IntelligenceMicroStrategy 9 - Extending Business Intelligence
MicroStrategy 9 - Extending Business Intelligence
 

Viewers also liked

Seminario 7 segunda parte
Seminario 7 segunda parteSeminario 7 segunda parte
Seminario 7 segunda partenatsancol
 
Edital e Anexos do Processo 11/2016 – Pregão Presencial 07/2016
Edital e Anexos do Processo 11/2016 – Pregão Presencial 07/2016Edital e Anexos do Processo 11/2016 – Pregão Presencial 07/2016
Edital e Anexos do Processo 11/2016 – Pregão Presencial 07/2016Maria Julia Medeiros
 
ZZB1469 Glass Pearl Media brochure 82516
ZZB1469 Glass Pearl Media brochure 82516ZZB1469 Glass Pearl Media brochure 82516
ZZB1469 Glass Pearl Media brochure 82516Kristina Macias
 
Jak wymienić pompę hydrauliczna?
Jak wymienić pompę hydrauliczna?Jak wymienić pompę hydrauliczna?
Jak wymienić pompę hydrauliczna?businesspo
 
Carrera profesional.derecho 1
Carrera profesional.derecho 1Carrera profesional.derecho 1
Carrera profesional.derecho 1lucita1010
 
Jakie usterki występują w klimatyzatorach?
Jakie usterki występują w klimatyzatorach?Jakie usterki występują w klimatyzatorach?
Jakie usterki występują w klimatyzatorach?businesspo
 
Baldung paper final type
Baldung paper final typeBaldung paper final type
Baldung paper final typeAnna Bean
 
Jak zbudować profesjonalny system crm?
Jak zbudować profesjonalny system crm?Jak zbudować profesjonalny system crm?
Jak zbudować profesjonalny system crm?businesspo
 
Appi jaatmed
Appi jaatmedAppi jaatmed
Appi jaatmedkairitw
 
Les chiffres clés de la Défense 2016
Les chiffres clés de la Défense 2016Les chiffres clés de la Défense 2016
Les chiffres clés de la Défense 2016Ministère des Armées
 
Vonnegut - Slaughterhouse 5
Vonnegut - Slaughterhouse 5Vonnegut - Slaughterhouse 5
Vonnegut - Slaughterhouse 5trevornorris
 
Parameter Kependudukan Papua Tahun 2015
Parameter Kependudukan Papua Tahun 2015Parameter Kependudukan Papua Tahun 2015
Parameter Kependudukan Papua Tahun 2015daldukpapua
 
Prota kls-1-k-13
Prota kls-1-k-13Prota kls-1-k-13
Prota kls-1-k-13Min Baros
 
Las aguas y la red hidrográfica en españa
Las aguas y la red hidrográfica en españaLas aguas y la red hidrográfica en españa
Las aguas y la red hidrográfica en españaestudiante
 
STUDENTS ATTITUDE AND ATTENDANCE powerpoint.
STUDENTS ATTITUDE AND ATTENDANCE powerpoint.STUDENTS ATTITUDE AND ATTENDANCE powerpoint.
STUDENTS ATTITUDE AND ATTENDANCE powerpoint.Trevor Parsons
 
Alla modeller är fel men några är användbara
Alla modeller är fel men några är användbaraAlla modeller är fel men några är användbara
Alla modeller är fel men några är användbaraADDQ
 

Viewers also liked (19)

Seminario 7 segunda parte
Seminario 7 segunda parteSeminario 7 segunda parte
Seminario 7 segunda parte
 
Edital e Anexos do Processo 11/2016 – Pregão Presencial 07/2016
Edital e Anexos do Processo 11/2016 – Pregão Presencial 07/2016Edital e Anexos do Processo 11/2016 – Pregão Presencial 07/2016
Edital e Anexos do Processo 11/2016 – Pregão Presencial 07/2016
 
Persbericht MKB Inkoop Onderzoek
Persbericht MKB Inkoop OnderzoekPersbericht MKB Inkoop Onderzoek
Persbericht MKB Inkoop Onderzoek
 
ZZB1469 Glass Pearl Media brochure 82516
ZZB1469 Glass Pearl Media brochure 82516ZZB1469 Glass Pearl Media brochure 82516
ZZB1469 Glass Pearl Media brochure 82516
 
Jak wymienić pompę hydrauliczna?
Jak wymienić pompę hydrauliczna?Jak wymienić pompę hydrauliczna?
Jak wymienić pompę hydrauliczna?
 
Carrera profesional.derecho 1
Carrera profesional.derecho 1Carrera profesional.derecho 1
Carrera profesional.derecho 1
 
Jakie usterki występują w klimatyzatorach?
Jakie usterki występują w klimatyzatorach?Jakie usterki występują w klimatyzatorach?
Jakie usterki występują w klimatyzatorach?
 
Baldung paper final type
Baldung paper final typeBaldung paper final type
Baldung paper final type
 
The godfather
The godfatherThe godfather
The godfather
 
Jak zbudować profesjonalny system crm?
Jak zbudować profesjonalny system crm?Jak zbudować profesjonalny system crm?
Jak zbudować profesjonalny system crm?
 
Appi jaatmed
Appi jaatmedAppi jaatmed
Appi jaatmed
 
Les chiffres clés de la Défense 2016
Les chiffres clés de la Défense 2016Les chiffres clés de la Défense 2016
Les chiffres clés de la Défense 2016
 
Hoja de medición meteorológica
Hoja de medición meteorológicaHoja de medición meteorológica
Hoja de medición meteorológica
 
Vonnegut - Slaughterhouse 5
Vonnegut - Slaughterhouse 5Vonnegut - Slaughterhouse 5
Vonnegut - Slaughterhouse 5
 
Parameter Kependudukan Papua Tahun 2015
Parameter Kependudukan Papua Tahun 2015Parameter Kependudukan Papua Tahun 2015
Parameter Kependudukan Papua Tahun 2015
 
Prota kls-1-k-13
Prota kls-1-k-13Prota kls-1-k-13
Prota kls-1-k-13
 
Las aguas y la red hidrográfica en españa
Las aguas y la red hidrográfica en españaLas aguas y la red hidrográfica en españa
Las aguas y la red hidrográfica en españa
 
STUDENTS ATTITUDE AND ATTENDANCE powerpoint.
STUDENTS ATTITUDE AND ATTENDANCE powerpoint.STUDENTS ATTITUDE AND ATTENDANCE powerpoint.
STUDENTS ATTITUDE AND ATTENDANCE powerpoint.
 
Alla modeller är fel men några är användbara
Alla modeller är fel men några är användbaraAlla modeller är fel men några är användbara
Alla modeller är fel men några är användbara
 

Similar to Enterprise application characteristics

Struggling with data management
Struggling with data managementStruggling with data management
Struggling with data managementDavid Walker
 
Data warehouse concepts
Data warehouse conceptsData warehouse concepts
Data warehouse conceptsobieefans
 
Thoughts on a research platform architecture: Simplify your application portf...
Thoughts on a research platform architecture: Simplify your application portf...Thoughts on a research platform architecture: Simplify your application portf...
Thoughts on a research platform architecture: Simplify your application portf...Pistoia Alliance
 
Fbdl enabling comprehensive_data_services
Fbdl enabling comprehensive_data_servicesFbdl enabling comprehensive_data_services
Fbdl enabling comprehensive_data_servicesCindy Irby
 
An Integrated ERP With Web Portal
An Integrated ERP With Web PortalAn Integrated ERP With Web Portal
An Integrated ERP With Web PortalTracy Morgan
 
Real-Time Data Warehouse Loading Methodology Ricardo Jorge S.docx
Real-Time Data Warehouse Loading Methodology Ricardo Jorge S.docxReal-Time Data Warehouse Loading Methodology Ricardo Jorge S.docx
Real-Time Data Warehouse Loading Methodology Ricardo Jorge S.docxsodhi3
 
A Web-Based ERP System For Business Services And Supply Chain Management App...
A Web-Based ERP System For Business Services And Supply Chain Management  App...A Web-Based ERP System For Business Services And Supply Chain Management  App...
A Web-Based ERP System For Business Services And Supply Chain Management App...Linda Garcia
 
Implementation of Data Marts in Data ware house
Implementation of Data Marts in Data ware houseImplementation of Data Marts in Data ware house
Implementation of Data Marts in Data ware houseIJARIIT
 
Emerging database landscape july 2011
Emerging database landscape july 2011Emerging database landscape july 2011
Emerging database landscape july 2011navaidkhan
 
Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...
Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...
Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...LindaWatson19
 
Log analyzer Needle in a haystack
Log analyzer  Needle in a haystackLog analyzer  Needle in a haystack
Log analyzer Needle in a haystackCenterRetro
 
ERP and related technology
ERP and related technology ERP and related technology
ERP and related technology Usman Tariq
 

Similar to Enterprise application characteristics (20)

Struggling with data management
Struggling with data managementStruggling with data management
Struggling with data management
 
Data warehouse concepts
Data warehouse conceptsData warehouse concepts
Data warehouse concepts
 
Thoughts on a research platform architecture: Simplify your application portf...
Thoughts on a research platform architecture: Simplify your application portf...Thoughts on a research platform architecture: Simplify your application portf...
Thoughts on a research platform architecture: Simplify your application portf...
 
Fbdl enabling comprehensive_data_services
Fbdl enabling comprehensive_data_servicesFbdl enabling comprehensive_data_services
Fbdl enabling comprehensive_data_services
 
CTP Data Warehouse
CTP Data WarehouseCTP Data Warehouse
CTP Data Warehouse
 
An Integrated ERP With Web Portal
An Integrated ERP With Web PortalAn Integrated ERP With Web Portal
An Integrated ERP With Web Portal
 
Real-Time Data Warehouse Loading Methodology Ricardo Jorge S.docx
Real-Time Data Warehouse Loading Methodology Ricardo Jorge S.docxReal-Time Data Warehouse Loading Methodology Ricardo Jorge S.docx
Real-Time Data Warehouse Loading Methodology Ricardo Jorge S.docx
 
Data warehouse presentation
Data warehouse presentationData warehouse presentation
Data warehouse presentation
 
Event Driven Architecture
Event Driven ArchitectureEvent Driven Architecture
Event Driven Architecture
 
A Web-Based ERP System For Business Services And Supply Chain Management App...
A Web-Based ERP System For Business Services And Supply Chain Management  App...A Web-Based ERP System For Business Services And Supply Chain Management  App...
A Web-Based ERP System For Business Services And Supply Chain Management App...
 
Copy of sec d (2)
Copy of sec d (2)Copy of sec d (2)
Copy of sec d (2)
 
Copy of sec d (2)
Copy of sec d (2)Copy of sec d (2)
Copy of sec d (2)
 
Issue in Data warehousing and OLAP in E-business
Issue in Data warehousing and OLAP in E-businessIssue in Data warehousing and OLAP in E-business
Issue in Data warehousing and OLAP in E-business
 
Business Impacts on SAP Deployments
Business Impacts on SAP DeploymentsBusiness Impacts on SAP Deployments
Business Impacts on SAP Deployments
 
Implementation of Data Marts in Data ware house
Implementation of Data Marts in Data ware houseImplementation of Data Marts in Data ware house
Implementation of Data Marts in Data ware house
 
Emerging database landscape july 2011
Emerging database landscape july 2011Emerging database landscape july 2011
Emerging database landscape july 2011
 
Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...
Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...
Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...
 
Log analyzer Needle in a haystack
Log analyzer  Needle in a haystackLog analyzer  Needle in a haystack
Log analyzer Needle in a haystack
 
ERP and related technology
ERP and related technology ERP and related technology
ERP and related technology
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 

Recently uploaded

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 

Recently uploaded (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 

Enterprise application characteristics

  • 1. Prof. Hasso Plattner A Course in In-Memory Data Management The Inner Mechanics of In-Memory Databases September 4, 2015 This learning material is part of the reading material for Prof. Plattner’s online lecture "In-Memory Data Management" taking place at www.openHPI.de. If you have any questions or remarks regarding the online lecture or the reading material, please give us a note at openhpi- imdb@hpi.uni-potsdam.de. We are glad to further improve the material. Research Group "Enterprise Platform and Integration Concepts", http://epic.hpi.uni-potsdam.de
  • 2.
  • 3. Chapter 3 Enterprise Application Characteristics 3.1 Enterprise Data Sources An enterprise data management system should be able to handle data com- ing from several different data sources. In the ecosystem of modern enter- prises, many applications work on and produce structured data. Enterprise Resource Planning (ERP) systems, for example, typically create transactional data to capture the operations of a business. More and more event and stream data is created by modern manufacturing machines and sensors. At the same time, large amounts of unstructured data is captured from the web, social networks, log files, support systems, and others. Business users need to query these different data sources as fast as possi- ble to derive business value from the data or coordinate the operations of the enterprise. Real-time analytics applications may work on any of the data sources and combine them. They analyze the structured data of ERP systems for real-time transactional reporting, classical analytics, planning, and sim- ulation. The data from other data sources can be taken into account as well, for example, a text analytics application can combine a customer sentiment analysis on social network data with sales numbers or a production planning application takes sensor data of the RFID sensors into account. 3.2 OLTP vs. OLAP An enterprise data management system should be able to handle transac- tional and analytical query types, which differ in several dimensions. Typi- cal queries for Online Transaction Processing (OLTP) can be the creation of sales orders, invoices, accounting data, the display of a sales order for a single customer, or the display of customer master data. Online Analytical Pro- cessing (OLAP) is used to summarize data, often for management reports. 17
  • 4. 18 3 Enterprise Application Characteristics Typical analytical reports are for example the sales figures aggregated and grouped by regions, different timeframes and products or the calculation of Key Performance Indicators (KPIs). Because it has always been considered that these query types are signifi- cantly different, the data management systems were split into two separate systems to tune the data storage and schemas accordingly. In the litera- ture, it is claimed that OLTP workloads are write-intensive, whereas OLAP- workloads are read-mostly and that the two workloads rely on “Opposing Laws of Database Physics” [Fre95]. Yet, data and query analysis of current enterprise systems showed that this statement is not true [KGZP10, KKG+11]. The main difference between systems that handle these query types is that OLTP systems handle more queries with a single select or queries that are highly selective returning only a few tuples, whereas OLAP systems calculate aggregations for only a few columns of a table, but for a large number of tuples. For the synchronization of the analytical system with the transactional system(s), a cost-intensive ETL (Extract-Transform-Load) process is required. The ETL process introduces a delay and is relatively complex, because all relevant changes have to be extracted from the outside source or sources if there are several, data is transformed to fit analytical needs, and it is loaded into the target database. 3.3 Drawbacks of the Separation of OLAP from OLTP While the separation of the database into two systems allows for workload specific optimizations in both systems, it also has a number of drawbacks: • The OLAP system does not have the latest data, because the ETL process introduces a delay. The delay can range from minutes to hours, or even days. Consequently, many decisions have to rely on stale data instead of using the latest information. • To achieve acceptable performance, OLAP systems work with predefined, materialized aggregates which reduce the query flexibility of the user. • Data redundancy is high. Similar information is stored in both systems, just differently optimized. • The schemas of the OLTP and OLAP systems are different, which intro- duces complexity for applications using both of them and for the ETL process synchronizing data between the systems.
  • 5. 3.5 Combining OLTP and OLAP data 19 3.4 The OLTP vs. OLAP Access Pattern Myth Transactional processing is often considered to have an equal share of read and write operations, while analytical processing is dominated by large reads and range queries. However, a workload analysis of multiple real customer systems reveals that OLTP and OLAP systems are not as different as expected in classic enterprise systems. As shown in [KKG+11], even OLTP systems process over 80% of read queries, and only a fraction of the workload contains write queries. Less than 10% of the actual workload are queries that modify existing data, e.g. updates and deletes. OLAP systems process an even larger amount of read queries, which make up about 95% of the workload. The updates in the transactional workload are of particular interest. An analysis of the updates in different industries is shown in Figure 3.1. It confirms that the number of updates in OLTP systems is quite low [KKG+11], and varies between industries. In the analyzed high-tech companies, the update rate peaks at about 12 %, meaning that about 88 % of all tuples saved in the transactional database are never updated. In other sectors, research showed even lower update rates, e.g., less than 1 % in banking and discrete manufacturing [KKG+11]. These results lead to the assumption that updates and deletes can be implemented by inserting tuples with timestamps to record the period of their validity. The additional benefit of this so called insert-only approach is that the complete transactional data history and a tuple’s life cycle are saved in the database automatically, avoiding the need for log-keeping in the application for auditing reasons. More details about the insert-only approach will be provided in Chapter 26. We can conclude that the characteristics of the two workload categories are not that different after all, which leads to the vision of reuniting the two systems and to combine OLTP and OLAP data in one system. 3.5 Combining OLTP and OLAP data The main benefit of the combination is that both, transactional and analytical queries can be executed on the same database using the same set of data as a “single source of truth”. Thereby, the costly ETL process becomes obsolete and all queries are performed against the latest version of the data. In this book we show that with the use of modern hardware we can elim- inate the need for pre-computed aggregates and materialized views. Data aggregation can be executed on-demand and analytical views can be pro- vided without delay. With the expected response time of analytical queries below one second, it is possible to perform analytical query processing on the transactional data directly, anytime and from any device. By dropping the
  • 6. 20 3 Enterprise Application Characteristics Industry Percentage.of.Updated.Rows.in.Financial.Accounting CPG 6% 6,2 High,Tech 12% 12,3 Logistics 4% 3,9 Discrete:M. 1% 0,9 Banking 1% 1 0%: 2%: 4%: 6%: 8%: 10%: 12%: 14%: CPG: High,Tech: LogisCcs: Discrete:M.: Banking: Percentage.of.Updated.Rows. Fig. 3.1: Updates of a Financial Application in different Industries pre-computation of aggregates and materialization of views, applications and data structures can be simplified. The management of aggregates and views (building, maintaining, and storing them) is not necessary any longer. The resulting mixed workload combines the characteristics of OLAP and OLTP workloads. A part of the queries in the workload performs typical transactional request like the selection of a few, complete rows. Others aggre- gate large amounts of the transactional data to generate real-time analytical reports on the latest data. Especially applications that inherently use access patterns from both workload groups and need access to the up-to-date data benefit greatly from fast access to large amounts of transactional data, e.g. dunning or planning applications. More application examples are given in Chapter 35. 3.6 Enterprise Data is Sparse Data By analyzing enterprise data in standard software, special data characteris- tics were identified. Most interestingly, most tables are very wide and contain hundreds of columns. However, many attributes of such table are not used at all: 55 % of all columns are unused on average per company. This is due to the fact, that standard software needs to support many workflows in dif- ferent industries and countries, however a single company never uses all of them. Further, in many columns NULL or default values are dominant, so the entropy (information containment) of these columns is very low (near zero). But even the columns that are used by a specific company often have a low cardinality of values, i.e., there are very few distinct values. Often due to the fact that the data models the real world, and every company has
  • 7. REFERENCES 21 only a limited number of products that can be sold, to a limited number of customers, by a limited number of employees and so on. These characteristics facilitate the efficient use of compression techniques that we will introduce in Chapter 7, leading to lower memory consumption and better query performance. 3.7 References [Fre95] Clark D. French. “One Size Fits All” Database Architectures Do Not Work for DSS. SIGMOD Rec., 24(2):449–450, May 1995. [KGZP10] J. Krueger, M. Grund, A. Zeier, and H. Plattner. Enterprise Application-specific Data Management. In EDOC, pages 131– 140, 2010. [KKG+11] J. Krueger, C. Kim, M. Grund, N. Satish, D. Schwalb, J. Chhugani, H. Plattner, P. Dubey, and A. Zeier. Fast Updates on Read- Optimized Databases Using Multi-Core CPUs. PVLDB, 2011.