Column Oriented Databases
Arundhati Kanungo, Developer Associate, SAP Labs India Pvt. Ltd.
June 25, 2017
© 2016 SAP SE or an SAP affiliate company. All rights reserved. 2Public
Agenda
• Definition of Column Oriented Databases
• History of Column Oriented Databases
• Working of Column Oriented Databases
• Top 3 Column Oriented Databases
• Advantages of Column Oriented Databases
• Disadvantages of Column Oriented Databases
• Row vs Column Oriented Databases
• Industries to benefit from Column Oriented Databases
• Conclusion
• Future Work
Column Oriented Databases - Definition
• Database management systems that store table data as columns of data rather than as rows
of data
• Have advantages for data warehousing, customer relationship management (CRM), library
card catalogs, and other ad hoc inquiry systems where aggregates are computed over large
numbers of similar data items
• The term refers to both the ease of expressing a column oriented structure and the focus on
optimizations for column oriented workloads
• Common examples include Greenplum Database, Calpont InfiniDB, Accumulo, Teradata,
SenSage, EXASOL, MonetDB, RCFile, Sqrrl, etc.
Column Oriented Databases - History
• 1969
• TAXIR, the first application of a column oriented database storage system, was developed with a focus on
information retrieval in biology.
• 1976
• Statistics Canada implemented the RAPID system for processing and retrieval of the Canadian Census of
Population and Housing as well as several other statistical applications.
• 1977 - 1990
• The RAPID system was shared with other statistical organizations throughout the world for widespread use in
the 1980s. It was used by Statistics Canada until the 1990s.
• 1993
• KDB was launched as the first commercially available column oriented database.
• 1995
• Sybase IQ followed as the second commercially available column oriented database.
• 2005
• Column oriented databases have proliferated rapidly since about 2005, with many open source and
commercial implementations.
Column Oriented Databases - Working
EmpId  LastName  FirstName  Salary
10     Smith     Joe        40000
12     Jones     Mary       50000
11     Johnson   Cathy      44000
22     Jones     Bob        55000
To minimize disk seeks, a column oriented database serializes all values of a column
together, then the values of the next column, and so on. For our example table, with row IDs
001 to 004 assigned in order, the data would be stored as below.
10:001,12:002,11:003,22:004;
Smith:001,Jones:002,Johnson:003,Jones:004;
Joe:001,Mary:002,Cathy:003,Bob:004;
40000:001,50000:002,44000:003,55000:004;
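Because each serialized column resembles an index in a row store, it is tempting to conclude
that a column-oriented store "is really just" a row-store with an index on every column. That
is a misconception: the columns are the primary storage for the data, not an auxiliary
structure built beside the rows.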
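The two layouts can be sketched in a few lines of Python. This is a minimal illustration, assuming an in-memory list of tuples and the `value:rowid` text format used in the example; the function names `serialize_rows` and `serialize_columns` are hypothetical, and real engines use binary, usually compressed, formats.

```python
# Example table from the slide: (EmpId, LastName, FirstName, Salary).
ROWS = [
    (10, "Smith", "Joe", 40000),
    (12, "Jones", "Mary", 50000),
    (11, "Johnson", "Cathy", 44000),
    (22, "Jones", "Bob", 55000),
]


def serialize_rows(rows):
    """Row-oriented layout: all values of one row are stored together."""
    return "".join(
        f"{rowid:03d}:" + ",".join(str(v) for v in row) + ";"
        for rowid, row in enumerate(rows, start=1)
    )


def serialize_columns(rows):
    """Column-oriented layout: all values of one column are stored together."""
    columns = zip(*rows)  # transpose rows into columns
    return "".join(
        ",".join(f"{v}:{rowid:03d}" for rowid, v in enumerate(col, start=1)) + ";"
        for col in columns
    )


row_layout = serialize_rows(ROWS)
column_layout = serialize_columns(ROWS)
print(row_layout)
print(column_layout)
```

With this layout, a query that needs only Salary can read the last segment of `column_layout` and skip the rest, whereas the row layout forces it to pass over every field of every record.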
Top 3 Column Oriented Databases
• Sybase IQ
• Aims to deliver high-end performance for critical analytics, business intelligence, and data warehousing solutions,
using a highly optimized server dedicated to analytics.
• Is column oriented, with a grid-based architecture, patented data compression, and an advanced query optimizer,
and therefore delivers high performance, flexibility, and economy in challenging reporting and analytics
environments.
• Infobright
• Combining a column oriented database with its Knowledge Grid architecture delivers a self-managed,
scalable, high-performance analytics query platform.
• Industry-leading data compression (10:1 up to 40:1) considerably lowers storage requirements and the need for
expensive hardware infrastructure.
• Vertica
• Purpose-built platform for extracting value from data in workloads that need high-performance, real-time analytics.
• Highly scalable, with a seamless integration ecosystem that builds on capabilities such as data loading, queries,
columnar storage, MPP architecture, and data compression.
Column Oriented Databases - Advantages
• Scalability and fast data loading for Big Data
• Accessible by many third-party BI analytic tools
• Simple systems administration
• High performance on aggregation queries (like COUNT, SUM, AVG, MIN, MAX)
• Highly efficient data compression and/or partitioning
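Why columns tend to compress well, and why aggregates over them can be fast, can be illustrated with run-length encoding (RLE), one common columnar compression technique. The sketch below assumes a sorted, low-cardinality column with made-up values; `rle_encode` and `rle_sum` are hypothetical helpers, not any vendor's API, and the slides do not say which schemes specific products use.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_sum(encoded):
    """Compute SUM directly on the compressed form, without decompressing."""
    return sum(value * count for value, count in encoded)


# A sorted salary column with many repeated values (made-up data).
salaries = [40000] * 5 + [44000] * 3 + [50000] * 4
encoded = rle_encode(salaries)

print(encoded)           # [(40000, 5), (44000, 3), (50000, 4)]
print(rle_sum(encoded))  # 532000, the same as sum(salaries)
```

Twelve stored values shrink to three pairs here, and the SUM touches only those three pairs. A row layout interleaves values of different types, so runs like these rarely form.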
Column Oriented Databases - Disadvantages
• Record updates and deletes reduce storage efficiency
• Effective partitioning/indexing schemes can be difficult to design
• Transactional workloads are best avoided, and some systems do not support transactions at all
• Queries with table joins can reduce performance
Row vs Column Oriented Databases
Comparative Analysis
Row Oriented Databases
• Relatively less efficient for aggregate computation and for selects that read a few columns
across many rows
• Efficient for single-row inserts and updates
• Well-suited for OLTP (On-Line Transaction Processing) like
workloads
Column Oriented Databases
• Highly efficient for aggregate computation and for selects that read a few columns
across many rows
• Relatively less efficient for single-row inserts and updates
• Well-suited for OLAP (On-Line Analytical Processing) like
workloads
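The trade-off between the two layouts can be sketched with a toy cost model. This is an assumption-laden illustration, not a benchmark: it counts "values touched" and "locations written" in plain Python containers as a stand-in for I/O, and the function names and the extra employee record are hypothetical.

```python
COLUMN_NAMES = ("EmpId", "LastName", "FirstName", "Salary")
ROWS = [
    (10, "Smith", "Joe", 40000),
    (12, "Jones", "Mary", 50000),
    (11, "Johnson", "Cathy", 44000),
    (22, "Jones", "Bob", 55000),
]

# Row store: one complete record per entry.
row_store = [dict(zip(COLUMN_NAMES, row)) for row in ROWS]
# Column store: one list of values per column.
column_store = {name: list(col) for name, col in zip(COLUMN_NAMES, zip(*ROWS))}


def avg_salary_rows(store):
    """Return (average, values touched); each whole record is fetched."""
    touched = sum(len(record) for record in store)
    total = sum(record["Salary"] for record in store)
    return total / len(store), touched


def avg_salary_columns(store):
    """Return (average, values touched); only the Salary column is read."""
    salaries = store["Salary"]
    return sum(salaries) / len(salaries), len(salaries)


def insert_rows(store, record):
    """Append one record; returns the number of separate locations written."""
    store.append(record)
    return 1


def insert_columns(store, record):
    """Append one value per column; returns the number of locations written."""
    for name, value in record.items():
        store[name].append(value)
    return len(record)


row_avg = avg_salary_rows(row_store)
col_avg = avg_salary_columns(column_store)

# Made-up record used only to illustrate the insert path.
new_employee = dict(zip(COLUMN_NAMES, (25, "Lee", "Ann", 47000)))
row_writes = insert_rows(row_store, dict(new_employee))
col_writes = insert_columns(column_store, new_employee)

print(row_avg, col_avg)        # (47250.0, 16) (47250.0, 4)
print(row_writes, col_writes)  # 1 4
```

Under this model the aggregate touches a quarter of the values in the column layout, while a single insert writes to four places instead of one, which is the OLAP versus OLTP split in miniature.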
Column Oriented Databases - Benefited Industries
• Telecommunications: Improves response time to the customer by reducing input/output (I/O)
• Financial Services: Supports high-performance, millisecond response times for the queries
required for inbound market
• Retail: Reads only the columns referenced in a query, driving higher performance and lower
processing costs compared to reading all the columns in the table
Conclusion
Column oriented databases are a key technology that delivers high business value by helping
enterprises adapt their information infrastructure to the evolving demands for timely, reliable
intelligence to run the business. In addition, they have far-reaching implications for the design of
systems, and can offer major cost savings, for example through lower power and cooling requirements.
Future Work
Columnar databases can be very helpful in big data projects. Big data is one of the
biggest challenges faced today. When we have a large volume and variety of real-time data, we
might want to use a columnar database to exploit its flexibility, performance, and scalability. To
date, HBase is among the most widely used column oriented stores for big data. I look forward to
carrying out a comparative analysis of the performance of HBase and other columnar databases
when fed with big data.
References
1. Stonebraker et al., “C-Store: A Column-oriented DBMS”, Proceedings of the 31st VLDB Conference, Trondheim, Norway, 2005
2. Copeland, George P. and Khoshafian, Setrag N., “A Decomposition Storage Model”, SIGMOD '85, 1985
3. Daniel Abadi and Samuel Madden, "Debunking Another Myth: Column-Stores vs. Vertical Partitioning", The Database Column, 31 July 2008
4. Stavros Harizopoulos, Daniel Abadi and Peter Boncz, "Column-Oriented Database Systems", VLDB 2009 Tutorial, p. 5
5. Pat & Betty O’Neil, Xuedong Chen and Stephen Revilak, “The Star Schema Benchmark and Augmented Fact Table Indexing”, TPC Technology
Conference 8/24/09
6. D. J. Abadi, S. R. Madden, N. Hachem, “Column-stores vs. Row-stores: How Different are They Really?”, in: SIGMOD’08, 2008, pp. 967–980.
7. N. Bruno, “Teaching an old elephant new tricks”, in: CIDR ’09, 2009.
8. Daniel Lemire, Owen Kaser, Kamel Aouiche, "Sorting Improves Word-Aligned Bitmap Indexes", Data & Knowledge Engineering, Volume 69,
Issue 1 (2010), pp. 3-28.
9. Daniel Lemire and Owen Kaser, “Reordering Columns for Smaller Indexes”, Information Sciences 181 (12), 2011
10. Slezak et al., “Brighthouse: An Analytic Data Warehouse for Ad hoc Queries”, Proceedings of the 34th VLDB Conference, Auckland, New
Zealand, 2008
11. Estabrook, Brill, “The Theory of the TAXIR Accessioner”, Mathematical Biosciences, Volume 5, Issues 3–4, Elsevier B.V.
12. Turner, Hammond, Cotton, “A DBMS for Large Statistical Databases”, Proceedings of VLDB 1979, Rio de Janeiro, Brazil
Thank you
Contact information:
Arundhati Kanungo
Developer Associate
SAP Labs India Pvt. Ltd.
Arundhati.Kanungo@sap.com
+91 7406313166
Any Questions???

More Related Content

What's hot

NOSQL Databases types and Uses
NOSQL Databases types and UsesNOSQL Databases types and Uses
NOSQL Databases types and UsesSuvradeep Rudra
 
Unidad 3 Modelamiento De Datos Conceptual
Unidad 3 Modelamiento De Datos ConceptualUnidad 3 Modelamiento De Datos Conceptual
Unidad 3 Modelamiento De Datos ConceptualSergio Sanchez
 
Relational vs Non Relational Databases
Relational vs Non Relational DatabasesRelational vs Non Relational Databases
Relational vs Non Relational DatabasesAngelica Lo Duca
 
Non Relational Databases
Non Relational DatabasesNon Relational Databases
Non Relational DatabasesChris Baglieri
 
Visite guidée au pays de la donnée - Introduction et tour d'horizon
Visite guidée au pays de la donnée - Introduction et tour d'horizonVisite guidée au pays de la donnée - Introduction et tour d'horizon
Visite guidée au pays de la donnée - Introduction et tour d'horizonGautier Poupeau
 
NoSQL Graph Databases - Why, When and Where
NoSQL Graph Databases - Why, When and WhereNoSQL Graph Databases - Why, When and Where
NoSQL Graph Databases - Why, When and WhereEugene Hanikblum
 
Unidad 1. Fundamentos de Base de Datos
Unidad 1. Fundamentos de Base de DatosUnidad 1. Fundamentos de Base de Datos
Unidad 1. Fundamentos de Base de Datoshugodanielgd
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2Fabio Fumarola
 
Sql o NoSql en Informática Médica
Sql o NoSql en Informática MédicaSql o NoSql en Informática Médica
Sql o NoSql en Informática MédicaLiz Armenteros
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Databasenehabsairam
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented DatabasesFabio Fumarola
 
Database Management System - SQL Advanced Training
Database Management System - SQL Advanced TrainingDatabase Management System - SQL Advanced Training
Database Management System - SQL Advanced TrainingMoutasm Tamimi
 
Database Fundamental
Database FundamentalDatabase Fundamental
Database FundamentalGong Haibing
 
Analytic & Windowing functions in oracle
Analytic & Windowing functions in oracleAnalytic & Windowing functions in oracle
Analytic & Windowing functions in oracleLogan Palanisamy
 

What's hot (20)

NOSQL vs SQL
NOSQL vs SQLNOSQL vs SQL
NOSQL vs SQL
 
NOSQL Databases types and Uses
NOSQL Databases types and UsesNOSQL Databases types and Uses
NOSQL Databases types and Uses
 
MongoDB
MongoDBMongoDB
MongoDB
 
Unidad 3 Modelamiento De Datos Conceptual
Unidad 3 Modelamiento De Datos ConceptualUnidad 3 Modelamiento De Datos Conceptual
Unidad 3 Modelamiento De Datos Conceptual
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
Relational vs Non Relational Databases
Relational vs Non Relational DatabasesRelational vs Non Relational Databases
Relational vs Non Relational Databases
 
Non Relational Databases
Non Relational DatabasesNon Relational Databases
Non Relational Databases
 
Visite guidée au pays de la donnée - Introduction et tour d'horizon
Visite guidée au pays de la donnée - Introduction et tour d'horizonVisite guidée au pays de la donnée - Introduction et tour d'horizon
Visite guidée au pays de la donnée - Introduction et tour d'horizon
 
NoSQL Graph Databases - Why, When and Where
NoSQL Graph Databases - Why, When and WhereNoSQL Graph Databases - Why, When and Where
NoSQL Graph Databases - Why, When and Where
 
Unidad 1. Fundamentos de Base de Datos
Unidad 1. Fundamentos de Base de DatosUnidad 1. Fundamentos de Base de Datos
Unidad 1. Fundamentos de Base de Datos
 
Graph databases
Graph databasesGraph databases
Graph databases
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2
 
Sql o NoSql en Informática Médica
Sql o NoSql en Informática MédicaSql o NoSql en Informática Médica
Sql o NoSql en Informática Médica
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Database
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented Databases
 
Database Management System - SQL Advanced Training
Database Management System - SQL Advanced TrainingDatabase Management System - SQL Advanced Training
Database Management System - SQL Advanced Training
 
Presentacion BD NoSQL
Presentacion  BD NoSQLPresentacion  BD NoSQL
Presentacion BD NoSQL
 
Database Fundamental
Database FundamentalDatabase Fundamental
Database Fundamental
 
NoSQL - MongoDB
NoSQL - MongoDBNoSQL - MongoDB
NoSQL - MongoDB
 
Analytic & Windowing functions in oracle
Analytic & Windowing functions in oracleAnalytic & Windowing functions in oracle
Analytic & Windowing functions in oracle
 

Similar to Column Oriented Databases

Building a Big Data Solution
Building a Big Data SolutionBuilding a Big Data Solution
Building a Big Data SolutionJames Serra
 
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...DataWorks Summit/Hadoop Summit
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureJames Serra
 
QUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICSQUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICSijcsit
 
CIO Guide to Using SAP HANA Platform For Big Data
CIO Guide to Using SAP HANA Platform For Big DataCIO Guide to Using SAP HANA Platform For Big Data
CIO Guide to Using SAP HANA Platform For Big DataSnehanshu Shah
 
A beginners guide to Cloudera Hadoop
A beginners guide to Cloudera HadoopA beginners guide to Cloudera Hadoop
A beginners guide to Cloudera HadoopDavid Yahalom
 
How to Empower Your Business Users with Oracle Data Visualization
How to Empower Your Business Users with Oracle Data VisualizationHow to Empower Your Business Users with Oracle Data Visualization
How to Empower Your Business Users with Oracle Data VisualizationPerficient, Inc.
 
Enable Better Decision Making with Power BI Visualizations & Modern Data Estate
Enable Better Decision Making with Power BI Visualizations & Modern Data EstateEnable Better Decision Making with Power BI Visualizations & Modern Data Estate
Enable Better Decision Making with Power BI Visualizations & Modern Data EstateCCG
 
Nw2008 tips tricks_edw_v10
Nw2008 tips tricks_edw_v10Nw2008 tips tricks_edw_v10
Nw2008 tips tricks_edw_v10Harsha Gowda B R
 
Total Data Industry Report
Total Data Industry ReportTotal Data Industry Report
Total Data Industry ReportRan Zhang
 
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®Cambridge Semantics
 
IARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptxIARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptxAIMLSEMINARS
 
What is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
What is OLAP -Data Warehouse Concepts - IT Online Training @ NewyorksysWhat is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
What is OLAP -Data Warehouse Concepts - IT Online Training @ NewyorksysNEWYORKSYS-IT SOLUTIONS
 
Transform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big DataTransform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big DataAshnikbiz
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Denodo
 
Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action MapR Technologies
 

Similar to Column Oriented Databases (20)

DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
SoftServe BI/BigData Workshop in Utah
SoftServe BI/BigData Workshop in UtahSoftServe BI/BigData Workshop in Utah
SoftServe BI/BigData Workshop in Utah
 
Building a Big Data Solution
Building a Big Data SolutionBuilding a Big Data Solution
Building a Big Data Solution
 
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
 
QUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICSQUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICS
 
Query Optimization for Big Data Analytics
Query Optimization for Big Data AnalyticsQuery Optimization for Big Data Analytics
Query Optimization for Big Data Analytics
 
CIO Guide to Using SAP HANA Platform For Big Data
CIO Guide to Using SAP HANA Platform For Big DataCIO Guide to Using SAP HANA Platform For Big Data
CIO Guide to Using SAP HANA Platform For Big Data
 
A beginners guide to Cloudera Hadoop
A beginners guide to Cloudera HadoopA beginners guide to Cloudera Hadoop
A beginners guide to Cloudera Hadoop
 
How to Empower Your Business Users with Oracle Data Visualization
How to Empower Your Business Users with Oracle Data VisualizationHow to Empower Your Business Users with Oracle Data Visualization
How to Empower Your Business Users with Oracle Data Visualization
 
Enable Better Decision Making with Power BI Visualizations & Modern Data Estate
Enable Better Decision Making with Power BI Visualizations & Modern Data EstateEnable Better Decision Making with Power BI Visualizations & Modern Data Estate
Enable Better Decision Making with Power BI Visualizations & Modern Data Estate
 
Nw2008 tips tricks_edw_v10
Nw2008 tips tricks_edw_v10Nw2008 tips tricks_edw_v10
Nw2008 tips tricks_edw_v10
 
Total Data Industry Report
Total Data Industry ReportTotal Data Industry Report
Total Data Industry Report
 
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
 
IARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptxIARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptx
 
What is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
What is OLAP -Data Warehouse Concepts - IT Online Training @ NewyorksysWhat is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
What is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
 
Transform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big DataTransform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big Data
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
 
Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action
 

Recently uploaded

Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 

Recently uploaded (20)

Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 

Column Oriented Databases

  • 1. Public Column Oriented Databases Arundhati Kanungo, Developer Associate, SAP Labs India Pvt. Ltd. June 25, 2017
  • 2. © 2016 SAP SE or an SAP affiliate company. All rights reserved. 2Public Agenda • Definition of Column Oriented Databases • History of Column Oriented Databases • Working of Column Oriented Databases • Top 3 Market Selling Column Oriented Databases • Advantages of Column Oriented Databases • Disadvantages of Column Oriented Databases • Row vs Column Oriented Databases • Industries to benefit from Column Oriented Databases • Conclusion • Future Work
  • 4. © 2016 SAP SE or an SAP affiliate company. All rights reserved. 4Public • Database management systems that store table data as columns of data rather than as rows of data • Have advantages for data warehousing, customer relationship management (CRM), and library card catalogs, and other ad hoc enquiry where aggregates are computed over large numbers of similar data items • Refer to both the ease of expressing a column oriented structure and the focus on optimizations for column oriented workloads • Common examples include Greenplum Database, Calpont InfiniDB, Accumulo, Teradata, SenSage, EXASOL, MonetDB, RCFile, Sqrrl, etc.
  • 6. © 2016 SAP SE or an SAP affiliate company. All rights reserved. 6Public • 1969 • With focus on information-retrieval in biology, the first application of a column oriented database storage system called TAXIR was developed. • 1976 • Statistics Canada implemented the RAPID system for processing and retrieval of the Canadian Census of Population and Housing as well as several other statistical applications. • 1977 - 1990 • The RAPID system was shared with other statistical organizations throughout the world for widespread use in the 1980s. It was used by Statistics Canada until the 1990s. • 1993 • KDB was launched as the first commercially available column oriented database. • 1995 • Sybase IQ gained prominence as the second column oriented database. • 2005 • The traditional column oriented databases have changed rapidly since 2005 with many open source and commercial implementations.
  • 8. Example table:

    EmpId  LastName  FirstName  Salary
    10     Smith     Joe        40000
    12     Jones     Mary       50000
    11     Johnson   Cathy      44000
    22     Jones     Bob        55000

    To minimize seeks, a column oriented database serializes all values of a column together, then the values of the next column, and so on. For the example table, the data would be stored as follows:

    10:001,12:002,11:003,22:004;Smith:001,Jones:002,Johnson:003,Jones:004;Joe:001,Mary:002,Cathy:003,Bob:004;40000:001,50000:002,44000:003,55000:004;

    In this layout each column resembles an index in a row-based system, which can lead to the mistaken belief that a column-oriented store "is really just" a row-store with an index on every column. What differs is the mapping of the data: a row-store index maps indexed values to record IDs, whereas here the record IDs map to the column values, which are themselves the stored data.
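The serialization described on this slide can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular product stores data: the function names are made up for this example, and the three-digit record IDs simply mirror the slide's notation. The row-oriented variant is included only for contrast.

```python
# Example table from the slide: (EmpId, LastName, FirstName, Salary).
rows = [
    (10, "Smith", "Joe", 40000),
    (12, "Jones", "Mary", 50000),
    (11, "Johnson", "Cathy", 44000),
    (22, "Jones", "Bob", 55000),
]


def serialize_row_oriented(rows):
    """Keep each record's values together, one record after another."""
    return ";".join(
        f"{rid:03d}:" + ",".join(str(value) for value in row)
        for rid, row in enumerate(rows, start=1)
    ) + ";"


def serialize_column_oriented(rows):
    """Keep all values of one column together, then the next column."""
    columns = zip(*rows)  # transpose: one tuple per column
    return ";".join(
        ",".join(f"{value}:{rid:03d}" for rid, value in enumerate(column, start=1))
        for column in columns
    ) + ";"


print(serialize_row_oriented(rows))
print(serialize_column_oriented(rows))
```

The second call reproduces the column-wise string shown on the slide, while the first keeps each employee's four values side by side.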
  • 9. Top 3 Column Oriented Databases
  • 10. • Sybase • Aims to deliver high-end performance for critical analytics, business intelligence, and data warehousing solutions using a highly optimized server dedicated to analytics. • Column oriented, with a grid-based architecture, patented data compression, and an advanced query optimizer; it therefore delivers high performance, flexibility, and economy in challenging reporting and analytics environments. • Infobright • Combines a column oriented database with its Knowledge Grid architecture to deliver a self-managing, scalable, high-performance analytic query platform. • Industry-leading data compression (10:1 up to 40:1) considerably lowers storage requirements and reduces the need for expensive hardware infrastructure. • Vertica • Purpose-built platform for data with high-performance, real-time analytics needs. • Highly scalable, with a seamless integration ecosystem and capabilities such as data loading, queries, columnar storage, MPP architecture, and data compression.
  • 11. Column Oriented Databases – Advantages
  • 12. • Scalability and fast data loading for Big Data • Accessible by many third-party BI analytic tools • Simple systems administration • High performance on aggregation queries (like COUNT, SUM, AVG, MIN, MAX) • Highly efficient data compression and/or partitioning
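Two of these advantages, fast aggregation and efficient compression, can be illustrated with a toy columnar table. The sketch below assumes a table held as one Python list per column; the `table` dictionary and both helper functions are hypothetical and not taken from any of the products named in this deck. Run-length encoding is used here as one common columnar compression technique, not as the specific method any vendor uses.

```python
from itertools import groupby

# Hypothetical columnar table: one list per column.
table = {
    "EmpId": [10, 12, 11, 22],
    "LastName": ["Smith", "Jones", "Johnson", "Jones"],
    "Salary": [40000, 50000, 44000, 55000],
}


def average(column):
    """Aggregate over a single column; no other column is read."""
    return sum(column) / len(column)


def run_length_encode(column):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]


avg_salary = average(table["Salary"])

# Sorting one column on its own breaks its alignment with the other columns;
# it is done here only to show how sorted, repetitive data forms long runs.
encoded_names = run_length_encode(sorted(table["LastName"]))

print(avg_salary)
print(encoded_names)
```

The aggregate touches only the `Salary` list, which is the reason COUNT, SUM, AVG, MIN, and MAX tend to be cheap in this layout. Real systems that sort for compression keep columns aligned, for example by sorting the whole table on a key.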
  • 13. Column Oriented Databases – Disadvantages
  • 14. • Record updates and deletes reduce storage efficiency • Effective partitioning/indexing schemes can be difficult to design • Transactions should be avoided or are not supported at all • Queries with table joins can reduce performance
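A rough way to see why record-level writes are awkward for a columnar layout is to count how many separate storage areas a single insert touches. The two classes below are a deliberately simplified model written for this illustration; the class names and the `touched` counter are assumptions, and real engines buffer and batch writes in ways this sketch ignores.

```python
class RowStore:
    """Toy row layout: each record is stored as one contiguous tuple."""

    def __init__(self):
        self.rows = []
        self.touched = 0  # number of storage areas written so far

    def insert(self, record):
        self.rows.append(tuple(record))
        self.touched += 1  # one record, one write


class ColumnStore:
    """Toy column layout: each column is stored as its own list."""

    def __init__(self, width):
        self.columns = [[] for _ in range(width)]
        self.touched = 0  # number of storage areas written so far

    def insert(self, record):
        for column, value in zip(self.columns, record):
            column.append(value)
            self.touched += 1  # one write per column


row_store = RowStore()
column_store = ColumnStore(width=4)
record = (10, "Smith", "Joe", 40000)

row_store.insert(record)
column_store.insert(record)

print(row_store.touched, column_store.touched)
```

In this model a four-column record costs one write in the row layout and four in the column layout, which hints at why frequent small updates and transactions are usually listed among the weaknesses of column stores.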
  • 15. Row vs Column Oriented Databases
  • 16. Comparative Analysis • Row Oriented Databases • Relatively less efficient for aggregate computation, multiple updates, and multiple selects; more efficient for single inserts • Well-suited for OLTP (On-Line Transaction Processing)-like workloads • Column Oriented Databases • Highly efficient for aggregate computation, multiple updates, and multiple selects; less efficient for single inserts • Well-suited for OLAP (On-Line Analytical Processing)-like workloads
  • 17. Column Oriented Databases – Benefited Industries
  • 18. • Telecommunications: Helps improve response time to the customer by reducing input and output • Financial Services: Supports the high-performance, millisecond query response times required for inbound market • Retail: Reads only the columns referenced in a query, which drives higher performance and lowers processing costs compared to reading all the columns in the table
  • 20. Column oriented databases are a key technology that delivers high business value by helping enterprises adapt their information infrastructure to the evolving demand for timely, reliable intelligence to run the business. In addition, they have far-reaching implications for the design of systems and offer major cost savings, including on power and cooling requirements.
  • 22. Columnar databases can be very helpful in big data projects. Handling big data is one of the biggest challenges faced today. When we have a large volume and variety of random real-time data, we might want to use a columnar database to exploit its flexibility, performance, and scalability. To date, HBase is the column oriented database most commonly associated with big data. I look forward to carrying out a comparative analysis of the performance of HBase and other columnar databases when fed with big data.
  • 23. References
    1. Stonebraker et al., "C-Store: A Column-oriented DBMS", Proceedings of the 31st VLDB Conference, Trondheim, Norway, 2005.
    2. George P. Copeland and Setrag N. Khoshafian, "A Decomposition Storage Model", SIGMOD '85, 1985.
    3. Daniel Abadi and Samuel Madden, "Debunking Another Myth: Column-Stores vs. Vertical Partitioning", The Database Column, 31 July 2008.
    4. Stavros Harizopoulos, Daniel Abadi and Peter Boncz, "Column-Oriented Database Systems", VLDB 2009 Tutorial, p. 5.
    5. Pat O'Neil, Betty O'Neil, Xuedong Chen and Stephen Revilak, "The Star Schema Benchmark and Augmented Fact Table Indexing", TPC Technology Conference, 24 August 2009.
    6. D. J. Abadi, S. R. Madden and N. Hachem, "Column-stores vs. Row-stores: How Different are They Really?", SIGMOD '08, 2008, pp. 967–980.
    7. N. Bruno, "Teaching an old elephant new tricks", CIDR '09, 2009.
    8. Daniel Lemire, Owen Kaser and Kamel Aouiche, "Sorting Improves Word-Aligned Bitmap Indexes", Data & Knowledge Engineering, Volume 69, Issue 1, 2010, pp. 3–28.
    9. Daniel Lemire and Owen Kaser, "Reordering Columns for Smaller Indexes", Information Sciences 181 (12), 2011.
    10. Slezak et al., "Brighthouse: An Analytic Data Warehouse for Ad hoc Queries", Proceedings of the 34th VLDB Conference, Auckland, New Zealand, 2008.
    11. Estabrook and Brill, "The Theory of the TAXIR Accessioner", Mathematical Biosciences, Volume 5, Issues 3–4, Elsevier B.V.
    12. Turner, Hammond and Cotton, "A DBMS for Large Statistical Databases", Proceedings of VLDB 1979, Rio de Janeiro, Brazil.
  • 24. Thank you • Contact information: Arundhati Kanungo, Developer Associate, SAP Labs India Pvt. Ltd., Arundhati.Kanungo@sap.com, +91 7406313166
  • 25. Any questions?