Columnar databases store data by column rather than by row. This layout is meant to make reading and writing to disk storage more efficient, which speeds up query results. Columnar databases are particularly useful for data analytics and data warehousing. They store all the values of each column together, rather than keeping all columns of a record in a single row. Benefits include improved performance for data analytics, business intelligence, and warehousing applications, owing to this structure and the compression it allows. However, they are less well suited than traditional row-oriented databases to incremental data loading or online transaction processing, because they are designed around different priorities.
1. ARCHITECTURE USED IN COLUMN FAMILY DATABASES
Presented by
M.Vidhya
I-M.Sc (CS)
Nadar Saraswathi College of Arts and
Science
2. What is a columnar database?
A columnar database is a database management
system (DBMS) that stores data in columns instead
of rows. Its purpose is to write and read data to
and from disk storage efficiently, reducing the time
it takes to return query results. Because a query
reads only the columns it needs, this layout greatly
improves disk I/O performance. Columnar databases
are particularly helpful for data analytics and
data warehousing.
3. Columnar database vs. row-oriented database
Column-oriented databases and row-oriented
databases are both ways of organizing data
in data warehouses. They differ in physical
layout: a row-oriented database stores all the
fields of a record together in a single row,
whereas a column-oriented database stores all
the values of each column together.
4.
5. Columnar database example
In a columnar database, all the values in a column are
physically grouped together. For example, all the
values in column 1 are grouped together; then all
values in column 2 are grouped together. The data is
stored in record order, so the 100th entry for column
1 and the 100th entry for column 2 belong to the
same input record. This enables individual data
elements, such as customer name, to be accessed in
columns as a group, rather than individually,
row by row.
6.
Account number   Last name   First name   Purchase (in dollars)
0411             Moriarty    Angela       52.35
0412             Richards    Jason        325.82
0413             Diamond     Samantha     25.50
Here is an example of a simple database table with four columns and three rows.
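The difference between the two layouts can be sketched in a few lines of Python using the sample table. This is only an in-memory illustration, not how any particular database stores data on disk; the names `column_names`, `to_columns` and `record_at` are made up for this example.

```python
# Row-oriented view: each tuple holds every field of one record.
rows = [
    ("0411", "Moriarty", "Angela", 52.35),
    ("0412", "Richards", "Jason", 325.82),
    ("0413", "Diamond", "Samantha", 25.50),
]

# Illustrative column names (assumed, not taken from a real schema).
column_names = ("account_number", "last_name", "first_name", "purchase")


def to_columns(rows, names):
    """Regroup the records into one list per column, keeping record order."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def record_at(columns, index):
    """Rebuild one record by taking the same position from every column."""
    return tuple(values[index] for values in columns.values())


# Column-oriented view: all values of a column sit together.
columns = to_columns(rows, column_names)

# An analytic query such as "total purchases" touches a single column.
total_purchases = sum(columns["purchase"])

# Position ties the columns together: entry 1 of each column is the same record.
second_record = record_at(columns, 1)
```

Summing purchases reads only the `purchase` list, whereas the row-oriented version would have to step through every field of every record to reach the same values.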
7. Benefits of using a columnar database
Columnar databases have been around for decades,
but they offer benefits for modern business
applications such as data analytics, business
intelligence (BI) and data warehousing. Key
advantages of columnar databases include:
Multipurpose. Columnar databases receive a lot of
attention for big data applications, but they are
also used for other purposes: running online
analytical processing (OLAP) cubes, storing
metadata and doing real-time analytics.
8. Columnar database limitations
Traditional row-oriented databases are more suitable
for incremental data loading than columnar databases.
Incremental data loading is a technique that loads
only the data that is new or changed since the
previous load, rather than reloading the whole data set.
In a columnar database, each newly loaded record has
to be split up and written to every column, which
makes frequent small loads comparatively expensive.
The load is started by a trigger, an event or
condition that marks a good point to load the data.
Examples of triggers are another user adding
data or a certain time of day being reached.
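As a rough illustration of why small incremental loads favor a row layout, the sketch below loads one new record into both kinds of store and counts the append operations. It is a simplified model under stated assumptions: real columnar systems usually batch and buffer writes, the function names are hypothetical, and the record `0414` is invented purely for the example.

```python
def new_records(source_rows, last_loaded_account):
    """Select only the records added since the previous load (the increment)."""
    return [row for row in source_rows if row[0] > last_loaded_account]


def append_to_row_store(row_store, records):
    """Write each record once, as a unit. Returns the number of appends."""
    row_store.extend(records)
    return len(records)


def append_to_column_store(column_store, records):
    """Split each record and append one value per column. Returns appends."""
    appends = 0
    for record in records:
        for column, value in zip(column_store.values(), record):
            column.append(value)
            appends += 1
    return appends


source = [
    ("0411", "Moriarty", "Angela", 52.35),
    ("0412", "Richards", "Jason", 325.82),
    ("0413", "Diamond", "Samantha", 25.50),
    ("0414", "Example", "Record", 10.00),  # hypothetical new row, not from the slides
]

# Both stores already hold the first three records.
row_store = list(source[:3])
column_store = {
    "account_number": [r[0] for r in row_store],
    "last_name": [r[1] for r in row_store],
    "first_name": [r[2] for r in row_store],
    "purchase": [r[3] for r in row_store],
}

# The trigger fires; only records after account 0413 are loaded.
increment = new_records(source, last_loaded_account="0413")
row_appends = append_to_row_store(row_store, increment)
column_appends = append_to_column_store(column_store, increment)
```

With four columns, the single new record costs one append in the row store and four in the column store, and the gap grows with the number of columns.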
9. Online transaction processing (OLTP)
applications are also poorly suited to
column-oriented databases. Row-oriented databases
work better for OLTP applications because
they have better concurrent processing and
isolation capabilities, and they can read or
write a complete record in one place.