QlikView Data Storage
Key Objectives
• How QlikView stores data
• How to determine memory savings to be gained from
implementing structure changes
• The importance of load order
QlikView Data Storage
• Storage in a QlikView document is a two-level
organization. The first level stores distinct lists of
values. The second level stores table data consisting
of pointers into those lists.
Storage Level 1
There is exactly one list for each field included in the
document; if you have 25 fields in your document, you
will have 25 lists. As data is loaded during the reload
process, the lists are populated, so the order of the values
reflects the load order of the data.
Storage Level 2
The second level of storage in a QlikView document
consists of tables containing pointer values (or NULL) in
each cell. These pointer values correspond to the “index” of
the appropriate value in the distinct lists located in the first
level storage.
(Diagram slides: Storage Level 1 distinct-value lists and Storage Level 2 pointer tables)
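The two storage levels can be sketched in a few lines of Python. This is an illustrative model only, not QlikView's implementation; the real engine bit-packs its pointers and applies further optimizations.

```python
class FieldStorage:
    """Level 1: one distinct-value list per field, populated in load order."""
    def __init__(self):
        self.values = []   # distinct values, ordered by first appearance
        self.index = {}    # value -> position in self.values

    def intern(self, value):
        """Return the pointer (index) for a value, adding it if new."""
        if value not in self.index:
            self.index[value] = len(self.values)
            self.values.append(value)
        return self.index[value]

# Level 2: table rows hold pointers (or None for NULL), not the values themselves.
region = FieldStorage()
rows = [region.intern(v) for v in ["East", "West", "East", "North", "West"]]

print(region.values)  # ['East', 'West', 'North'] -- load order preserved
print(rows)           # [0, 1, 0, 2, 1]
```

Note how "East" and "West" are stored once in the level-1 list no matter how many rows reference them; the table keeps only small integer pointers.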
Memory Calculation Example
• Consider a key field consisting of potentially 550,000 distinct values in a
fact table containing 100 million rows of data.
• To calculate the storage required for this field, we need to consider the
list storage as well as the table pointer storage.
• Remember that these are minimum amounts, since there is, of course,
additional storage that QlikView will require to maintain the data.
• For the list data, let's assume that each value is a 10-byte character
(text) representation.
• If we simply multiply 10 bytes by 550,000, we get approximately
5.5 MB. This is the requirement to store the list, regardless of the
number of times this field is referenced in logical tables.
• Next, we need to calculate the pointer requirements for the logical table
including this field. Since 550,000 distinct values can be represented by
20 bits (2^20 = 1,048,576), we will need a pointer size of 2.5 bytes for
each row of non-null data.
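The pointer-width arithmetic above can be checked with a short snippet, using ceil(log2(n)) bits per pointer as the slide does:

```python
import math

def pointer_bits(distinct_count):
    """Bits needed to address this many distinct values."""
    return max(1, math.ceil(math.log2(distinct_count)))

def pointer_bytes(distinct_count):
    """Pointer size per row, in (possibly fractional) bytes."""
    return pointer_bits(distinct_count) / 8

print(pointer_bits(550_000))   # 20 bits, since 2**20 = 1,048,576 >= 550,000
print(pointer_bytes(550_000))  # 2.5 bytes per non-null row
```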
Memory Calculation Example
• As mentioned above, we have 100 million rows of data in our fact
table. Since 100 million times 2.5 bytes is 250 million bytes, we will
need approximately 250 MB to store the pointers for this field in the
logical table.
• So, the total storage required for this single field in one logical table
is approximately 256 MB. Since this is a key field, by definition it
also exists in one or more additional logical tables. If we assume a
dimension table with a single row of data for each key value, we
need to add roughly 1.4 MB (550,000 rows times 2.5 bytes) for the
dimension table pointer storage. This brings our total to about 257 MB
for this single field, almost all of which is pointer storage.
• Of course, in documents with small fact tables (less than 1 million
rows), this is not as much of an issue. But when dealing with large
fact tables, you need to be aware of the repercussions of adding
and deleting fields.
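The whole estimate can be reproduced in a few lines. The constants mirror the slide's assumptions (550,000 distinct 10-byte key values, 100 million fact rows, one dimension row per key); these are minimums, and actual QlikView overhead per field is higher.

```python
import math

DISTINCT = 550_000          # distinct key values
FACT_ROWS = 100_000_000     # rows in the fact table
DIM_ROWS = DISTINCT         # one dimension row per key value
VALUE_BYTES = 10            # assumed text size per distinct value

# Pointer size: 20 bits for 550,000 distinct values -> 2.5 bytes per row
ptr_bytes = math.ceil(math.log2(DISTINCT)) / 8

list_mb = DISTINCT * VALUE_BYTES / 1e6   # ~5.5 MB, stored once
fact_mb = FACT_ROWS * ptr_bytes / 1e6    # ~250 MB of fact-table pointers
dim_mb = DIM_ROWS * ptr_bytes / 1e6      # ~1.4 MB of dimension-table pointers

print(f"list: {list_mb:.1f} MB, fact: {fact_mb:.1f} MB, dim: {dim_mb:.1f} MB")
print(f"total: {list_mb + fact_mb + dim_mb:.0f} MB")
```

Running this confirms that the fact-table pointers, not the distinct-value list, dominate the footprint.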
Why is Memory Important?
• Memory usage is just one aspect of the performance of a
QlikView document. In a 64-bit environment, memory can seem
limitless.
• But even 64-bit systems have a limited amount of RAM
installed, and most systems will live with that amount for quite
a while. It is safe to assume that data, users, and - in the case
of QlikView Server - documents will continue to expand.
• Memory taken up needlessly is memory unavailable for that
expansion - both fixed and dynamic.
• So it is important to understand what your thresholds are, and
to know that once a threshold is breached, users will generally suffer.
Thank You
