SlideShare a Scribd company logo
1 of 34
Download to read offline
Column-oriented databases
        NoSQL Cologne UG
           07. 11. 2012

      Jan Steemann (triAGENS)

        twitter: @steemann




                 © 2012 triAGENS GmbH | 2012-11-07   1
Types of data stores
Key / value stores (opaque / typed)   Document stores (non-shaped / shaped)
                                         collection
        key             value                    key               document
                                                                     value

        key             value                    key               document

  ...                                     ...

                                                 Relational databases
                                         table

                                                 key                    value
                                                                         row




                                                                                   column
                                                                    column
                                                 key                         row

                                          ...



                                         © 2012 triAGENS GmbH | 2012-11-07                  2
Key / value stores (opaque)
   Keys are mapped to values
   Values are treated as BLOBs (opaque data)
   No type information is stored
   Values can be heterogenous


      key               value

      key               value


Example values:
 { name: „foo“, age: 25, city: „bar“ }   => JSON, but store will not care about it
 xdexadxb0x0b                         => binary, but store will not care about it




                                                © 2012 triAGENS GmbH | 2012-11-07        3
Key / value stores (typed)
   Keys are mapped to values
   Values have simple type information attached
   Type information is stored per value
   Values can still be heterogenous


      key             type   value

      key             type   value


Example values:
 number: 25                 => numeric, store can do something with it
 list: [ 1, 2, 3 ]          => list, store can do something with it




                                                © 2012 triAGENS GmbH | 2012-11-07   4
Document stores (non-shaped)
   Keys are mapped to documents
   Documents consist of attributes
   Attributes are name/typed value pairs, which may be nested
   Type information is stored per attribute
   Documents can be heterogenous
   Documents may be organised in collections or databases

                     document
      key               name      type     value           name        type          value

                     document
      key
                        name      type    name type value          name type value


Example documents:
 { name: „foo“, age: 25, city: „bar“ }            => attributes and sub-attributes
 { name: { first: „foo“, last: „bar“ }, age: 25 }    are typed and can be indexed


                                                 © 2012 triAGENS GmbH | 2012-11-07           5
Document stores (shaped)
   Same as document stores, but...
   ...document type information is stored in shapes
   ...documents with similar structure (attribute names and types) point to the same shape




                     document                                       shape
       key            shape     value           value               type name type name

                     document                                       shape
       key
                      shape    value    value    value              type

                     document
       key
                      shape    value    value    value




                                                   © 2012 triAGENS GmbH | 2012-11-07          6
Relational databases
   Tables (relations) consist of rows and columns
   Columns have a type. Type information is stored once per column
   A rows contains just values for a record (no type information)
   All rows in a table have the same columns and are homogenous
                    table
                     column type column type column type column type column type
                      row
      key               value        value         value            value          value

                      row
      key               value        value         value            value          value


Example rows:
 „foo“, „bar“, 25, 35.63

 „bar“, „baz“, 42, -673.342




                                               © 2012 triAGENS GmbH | 2012-11-07           7
Data stores, summary

   Documents in document stores can be heterogenous:
    Different documents can have different attributes and types
    Type information is stored per document or group of documents




                                                                           Homogenity
    Values in key / value stores can be heterogenous:




                                                                                            Flexibility


    Different values can have different types
    Type information is either not stored at all or stored per value
   Rows in relational databases are homogenous:
    Different rows have the same columns and types
    Type information is stored once per column


   Which data store type is ideal for the job depends on
    the types of queries that must be run on the data!

                                       © 2012 triAGENS GmbH | 2012-11-07                8
Query types: OLTP

   „transactional“ processing
   retrieve or modify individual records (mostly few records)
   use indexes to quickly find relevant records
   queries often triggered by end user actions and should complete
    instantly
   ACID properties may be important
   mixed read/write workload
   working set should fit in RAM




                                      © 2012 triAGENS GmbH | 2012-11-07   9
Query types: OLAP

   analytical processing / reporting
   derive new information from existing data
    (aggregates, transformations, calculations)
   queries often run on many records or complete data set
   data set may exceed size of RAM easily
   mainly read or even read-only workload
   ACID properties often not important, data can often be regenerated
   queries often run interactively
   common: not known in advance which aspects are interesting
    so pre-indexing „relevant“ columns is difficult


                                        © 2012 triAGENS GmbH | 2012-11-07   10
OLTP vs. OLAP

   properties of OLTP and OLAP queries are very different
   most data stores are specialised only for one class of queries
   where to run which type of queries?

           OLTP                                         Key / value store



                             ???                        Document store


           OLAP                                       Relational database




                                      © 2012 triAGENS GmbH | 2012-11-07     11
OLAP in relational databases

   Data in relational databases is fully typed
   Nested values can be simulated using auxilliary tables, joins etc.
   Most relational databases offer SQL which allows complex
    analysis queries
   Type overhead in relational databases is very low, allowing fast
    processing of data


   Overhead for (often unneeded) ACID properties may produce a
    signifcant performance penalty




                                       © 2012 triAGENS GmbH | 2012-11-07   12
Row vs. columnar relational databases

   All relational databases deal with tables, rows, and columns
   But there are sub-types:
       row-oriented: they are internally organised around the handling of rows
       columnar / column-oriented: these mainly work with columns
   Both types usually offer SQL interfaces and produce tables
    (with rows and columns) as their result sets
   Both types can generally solve the same queries
   Both types have specific use cases that they're good for
    (and use cases that they're not good for)




                                              © 2012 triAGENS GmbH | 2012-11-07   13
Row vs. columnar relational databases

   In practice, row-oriented databases are often optimised and
    particuarly good for OLTP workloads
   whereas column-oriented databases are often well-suited for
    OLAP workloads
   this is due to the different internal designs of row- and
    column-oriented databases




                                       © 2012 triAGENS GmbH | 2012-11-07   14
Row-oriented storage

   In row-oriented databases, row value data is usually stored
    contiguously:

    row0 header column0 value column1 value column2 value column3 value
    row1 header column0 value column1 value column2 value column3 value
    row2 header column0 value column1 value column2 value column3 value



    (the row headers contain record lengths, NULL bits etc.)




                                         © 2012 triAGENS GmbH | 2012-11-07   15
Row-oriented storage

   When looking at a table's datafile, it could look like this:
    header   values   header   values   header       values        header     values

                                    filesize


   Actual row values are stored at specific offsets of the values struct:
             values            values            values                     values
    header            header            header                     header
           c0 c1 c2          c0 c1 c2          c0 c1 c2                   c0 c1 c2

   Offsets depend on column types, e.g. 4 for int32, 8 for int64 etc.




                                          © 2012 triAGENS GmbH | 2012-11-07            16
Row-oriented storage

   To read a specific row, we need to determine its position first
   This is easy if rows have a fixed length:
       position = row number * row length + header length
       header length = 8
       row length = header length (8) + value length (8) = 16
       value length = c0 length (2) + c1 length (2) + c2 length (4) = 8

    row 0                   row 1                   row 2
                 values               values                  values
        header               header                  header
               c0 c1 c2             c0 c1 c2                c0 c1 c2
         8     2 2    4         8     2 2    4           8       2 2       4
               16                    16                         16



                                                 © 2012 triAGENS GmbH | 2012-11-07   17
Row-oriented storage

   In reality its often not that easy
   With varchars, values can have variable lengths

    header             values              header         values         header values

                                    filesize

   With variable length values, we cannot simply jump in the middle of
    the datafile because we don't know where a row starts
   To calculate the offset of a specifc row, we need to use the record
    lengths stored in the headers. This means traversing headers!



                                          © 2012 triAGENS GmbH | 2012-11-07              18
Row-oriented storage

   Another strategy is to split varchar values into two parts:
       prefixes of fixed-length are stored in-row
       excess data exceeding the prefix length are stored off-row,
        and are pointed to from the row
   This allows values to have fixed lengths again, at the price of
    additional lookups for the excess data:
    header      values    header    values    header       values        header     values




                                        excess varchar data




                                                © 2012 triAGENS GmbH | 2012-11-07            19
Row-oriented storage

   Datafiles are often organised in pages
   To read data for a specific row, the full page needs to be read
   The smaller the rows, the more will fit onto a page
    page                                 page
             values                               values                      values
    header              unused space     header                      header
           c0 c1 c2                             c0 c1 c2                    c0 c1 c2


   In reality, pages are often not fully filled up entirely
   This allows later update-in-place without page splits or movements
    (these are expensive operations) but may waste a lot of space
   Furthermore, rows must fit onto pages, leaving parts unused

                                         © 2012 triAGENS GmbH | 2012-11-07             20
Row-oriented storage

   Row-oriented storage is good if we need to touch one row.
    This normally requires reading/writing a single page
   Row-oriented storage is beneficial if all or most columns of a row
    need to be read or written. This can be done with a single
    read/write.
   Row-oriented storage is very inefficient if not all columns are
    needed but a lot of rows need to be read:
       Full rows are read, including columns not used by a query
       Reads are done page-wise. Not many rows may fit on a page when rows are
        big
       Pages are normally not fully filled, which leads to reading lots of unused areas
       Record (and sometimes page) headers need to be read, too but do not contain
        actual row data

                                               © 2012 triAGENS GmbH | 2012-11-07           21
Row-oriented storage

   Row-oriented storage is especially inefficient when only a small
    amout of columns is needed but the table has many columns
   Example: need only c1, but must read 2 pages to get just 3 values!
    page                                  page
             values                                values                      values
    header              unused space      header                      header
           c0 c1 c2                              c0 c1 c2                    c0 c1 c2


   We'd prefer getting 12 values with 1 read. This would be optimal:
    page
                   values
    c1 c1 c1 c1 c1 c1 c1 c1 c1 c1 c1 c1




                                          © 2012 triAGENS GmbH | 2012-11-07             22
Column-oriented storage

   Column-oriented databases primarily work on columns
   All columns are treated individually
   Values of a single column are stored contiguously
   This allows array-processing the values of a column
   Rows may be constructed from column values later if required
   This means column stores can still produce row output (tables)
   Values from multiple columns need to be retrieved and assembled
    for that, making implementation of bit more complex
   Query processors in columnar databases work on columns, too



                                      © 2012 triAGENS GmbH | 2012-11-07   23
Column-oriented storage

   Column stores can greatly improve the performance of queries that
    only touch a small amount of columns
   This is because they will only access these columns' particular data
   Simple math: table t has a total of 10 GB data, with
       column a: 4 GB
       column b: 2 GB
       column c: 3 GB
       column d: 1 GB
   If a query only uses column d, at most 1 GB of data will be
    processed by a column store
   In a row store, the full 10 GB will be processed

                                      © 2012 triAGENS GmbH | 2012-11-07   24
Column-oriented storage

   Column stores store data in column-specific files
   Simplest case: one datafile per column
   Row values for each column are stored contiguously


                                       column0 values
    column0
                 r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15 r16 r17

                                           filesize


                                       column1 values
    column1
                 r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15 r16 r17

                                           filesize


                                          © 2012 triAGENS GmbH | 2012-11-07      25
Column-oriented storage

   With every page read, we can read the values for many rows at
    once. We'll get more values per I/O than with row-storage
   This is ideal if we want to process most records anyway
   If values are fixed-length, we are also allowed random access to
    each record's value
   Saving header information in the values files is undesired.
    Can save meta information separately, e.g. NULL bits as a bit
    vector per column




                                     © 2012 triAGENS GmbH | 2012-11-07   26
Column-oriented storage

   All data within each column datafile have the same type,
    making it ideal for compression
   Usually a much better compression factor can be achieved for
    single columns than for entire rows
   Compression allows reducing disk I/O when reading/writing column
    data but has some CPU cost
   For data sets bigger than the memory size compression is often
    beneficial because disk access is slower than decompression




                                     © 2012 triAGENS GmbH | 2012-11-07   27
Column-oriented storage

   A good use case for compression in column stores is dictionary
    compression for variable length string values
   Each unique string is assigned an integer number
   The dictionary, consisting of integer number and string value, is
    saved as column meta data
   Column values are then integers only, making them small and fixed
    width
   This can save much space if string values are non-unique
   With dictionaries sorted by column value, this will also allow range
    queries



                                      © 2012 triAGENS GmbH | 2012-11-07    28
Column-oriented storage

   In a column database, it is relatively cheap to traverse
    all values of a column
   Building an index on a column only requires reading that column's
    data, not the complete table data as in a row store
   Adding or deleting columns is a relatively cheap operation, too




                                       © 2012 triAGENS GmbH | 2012-11-07   29
Column-oriented storage, segments

   Column data in column stores is ofted grouped into
    segments/packets of a specific size (e.g. 64 K values)
   Meta data is calculated and stored seperately per segment, e.g.:
       min value in segment
       max value in segment
       number of NOT NULL values in segment
       histograms
       compression meta data




                                          © 2012 triAGENS GmbH | 2012-11-07   30
Column-oriented storage, segments

   Segment meta data can be checked during query processing when
    no indexes are available
   Segment meta data may provide information about whether the
    segment can be skipped entirely, allowing to reduce the number
    of values that need to be processed in the query
   Calculating segment meta data is a relatively cheap operation
    (only needs to traverse column values in segment) but still should
    occur infrequently
   In a read-only or read-mostly workload this is tolerable




                                      © 2012 triAGENS GmbH | 2012-11-07   31
Column-oriented processing

   Column values are not processed row-at-a-time, but block-at-a-time
   This reduces the number of function calls (function call per block of
    values, but not per row)
   Operating in blocks allows compiler optimisations, e.g. loop
    unrolling, parallelisation, pipelining
   Column values are normally positioned in contiguous memory
    locations, also allowing SIMD operations (vectorisation)
   Working on many subsequent memory positions also improves
    cache usage (multiple values are in the same cache line) and
    reduces pipeline stalls
   All of the above make column stores ideal for batch processing


                                      © 2012 triAGENS GmbH | 2012-11-07   32
Column-oriented processing

   Reading all columns of a row is an expensive operation in a column
    store, so full row tuple construction is avoided or delayed as much
    as possible internally
   Updating or inserting rows may also be very expensive and may
    cost much more time than in a row store
   Some column stores are hybrids, with read-optimised (column)
    storage and write-optimised OLTP storage
   Still, column stores are not really made for OLTP workloads, and if
    you need to work with many columns at once, you'll pay a price in a
    column store




                                     © 2012 triAGENS GmbH | 2012-11-07   33
Column database products

   For a long time, there weren't many columnar databases around
   The topic re-emerged in 2006, when a paper about „C-Store“ was
    published. C-Store turned into „Vertica“ (commercial) later
   Now, there are many column stores around, but most are
    commercial
   Many vendors do not sell just the software, but also applicances
    with specialised hardware and setup
   Open source column stores include:
       MonetDB
       LucidDB (R.I.P.?)
       Infobright community edition (with reduced feature set, based on MySQL!!)


                                             © 2012 triAGENS GmbH | 2012-11-07      34

More Related Content

What's hot

Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...Simplilearn
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon RedshiftAmazon Web Services
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLRamakant Soni
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon RedshiftKel Graham
 
Columnar Databases (1).pptx
Columnar Databases (1).pptxColumnar Databases (1).pptx
Columnar Databases (1).pptxssuser55cbdb
 
NOSQL Databases types and Uses
NOSQL Databases types and UsesNOSQL Databases types and Uses
NOSQL Databases types and UsesSuvradeep Rudra
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overviewPritamKathar
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to ShardingMongoDB
 

What's hot (20)

Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
 
NOSQL vs SQL
NOSQL vs SQLNOSQL vs SQL
NOSQL vs SQL
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQL
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
Introduction to HDFS
Introduction to HDFSIntroduction to HDFS
Introduction to HDFS
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon Redshift
 
Hadoop Tutorial For Beginners
Hadoop Tutorial For BeginnersHadoop Tutorial For Beginners
Hadoop Tutorial For Beginners
 
Columnar Databases (1).pptx
Columnar Databases (1).pptxColumnar Databases (1).pptx
Columnar Databases (1).pptx
 
Graph based data models
Graph based data modelsGraph based data models
Graph based data models
 
NOSQL Databases types and Uses
NOSQL Databases types and UsesNOSQL Databases types and Uses
NOSQL Databases types and Uses
 
Introduction to Amazon DynamoDB
Introduction to Amazon DynamoDBIntroduction to Amazon DynamoDB
Introduction to Amazon DynamoDB
 
Document Database
Document DatabaseDocument Database
Document Database
 
Nosql data models
Nosql data modelsNosql data models
Nosql data models
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overview
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to Sharding
 
Nosql seminar
Nosql seminarNosql seminar
Nosql seminar
 
Cassandra Database
Cassandra DatabaseCassandra Database
Cassandra Database
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
 

Similar to Introduction to column oriented databases

AvocadoDB query language (DRAFT!)
AvocadoDB query language (DRAFT!)AvocadoDB query language (DRAFT!)
AvocadoDB query language (DRAFT!)avocadodb
 
Jan Steemann: Modelling data in a schema free world (Talk held at Froscon, 2...
Jan Steemann: Modelling data in a schema free world  (Talk held at Froscon, 2...Jan Steemann: Modelling data in a schema free world  (Talk held at Froscon, 2...
Jan Steemann: Modelling data in a schema free world (Talk held at Froscon, 2...ArangoDB Database
 
NoSQL_Databases
NoSQL_DatabasesNoSQL_Databases
NoSQL_DatabasesRick Perry
 
3.Implementation with NOSQL databases Document Databases (Mongodb).pptx
3.Implementation with NOSQL databases Document Databases (Mongodb).pptx3.Implementation with NOSQL databases Document Databases (Mongodb).pptx
3.Implementation with NOSQL databases Document Databases (Mongodb).pptxRushikeshChikane2
 
The Digital Cavemen of Linked Lascaux
The Digital Cavemen of Linked LascauxThe Digital Cavemen of Linked Lascaux
The Digital Cavemen of Linked LascauxRuben Verborgh
 
Data Base Management System.pdf
Data Base Management System.pdfData Base Management System.pdf
Data Base Management System.pdfTENZING LHADON
 
A project on DBMS and RDMS
A project on DBMS and RDMSA project on DBMS and RDMS
A project on DBMS and RDMSAayush vohra
 

Similar to Introduction to column oriented databases (20)

AvocadoDB query language (DRAFT!)
AvocadoDB query language (DRAFT!)AvocadoDB query language (DRAFT!)
AvocadoDB query language (DRAFT!)
 
NoSQL - Leo's notes
NoSQL - Leo's notesNoSQL - Leo's notes
NoSQL - Leo's notes
 
Presentation1
Presentation1Presentation1
Presentation1
 
Jan Steemann: Modelling data in a schema free world (Talk held at Froscon, 2...
Jan Steemann: Modelling data in a schema free world  (Talk held at Froscon, 2...Jan Steemann: Modelling data in a schema free world  (Talk held at Froscon, 2...
Jan Steemann: Modelling data in a schema free world (Talk held at Froscon, 2...
 
Artigo no sql x relational
Artigo no sql x relationalArtigo no sql x relational
Artigo no sql x relational
 
Nosql
NosqlNosql
Nosql
 
Nosql
NosqlNosql
Nosql
 
All About Database v1.1
All About Database  v1.1All About Database  v1.1
All About Database v1.1
 
Some NoSQL
Some NoSQLSome NoSQL
Some NoSQL
 
NoSQL_Databases
NoSQL_DatabasesNoSQL_Databases
NoSQL_Databases
 
DB2 on Mainframe
DB2 on MainframeDB2 on Mainframe
DB2 on Mainframe
 
3.Implementation with NOSQL databases Document Databases (Mongodb).pptx
3.Implementation with NOSQL databases Document Databases (Mongodb).pptx3.Implementation with NOSQL databases Document Databases (Mongodb).pptx
3.Implementation with NOSQL databases Document Databases (Mongodb).pptx
 
nosql.pptx
nosql.pptxnosql.pptx
nosql.pptx
 
Oslo baksia2014
Oslo baksia2014Oslo baksia2014
Oslo baksia2014
 
The Digital Cavemen of Linked Lascaux
The Digital Cavemen of Linked LascauxThe Digital Cavemen of Linked Lascaux
The Digital Cavemen of Linked Lascaux
 
WEB_DATABASE_chapter_4.pptx
WEB_DATABASE_chapter_4.pptxWEB_DATABASE_chapter_4.pptx
WEB_DATABASE_chapter_4.pptx
 
No sq lv2
No sq lv2No sq lv2
No sq lv2
 
Data Base Management System.pdf
Data Base Management System.pdfData Base Management System.pdf
Data Base Management System.pdf
 
Ch10
Ch10Ch10
Ch10
 
A project on DBMS and RDMS
A project on DBMS and RDMSA project on DBMS and RDMS
A project on DBMS and RDMS
 

More from ArangoDB Database

ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....ArangoDB Database
 
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022ArangoDB Database
 
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022ArangoDB Database
 
ArangoDB 3.9 - Further Powering Graphs at Scale
ArangoDB 3.9 - Further Powering Graphs at ScaleArangoDB 3.9 - Further Powering Graphs at Scale
ArangoDB 3.9 - Further Powering Graphs at ScaleArangoDB Database
 
GraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDBGraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDBArangoDB Database
 
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale ArangoDB Database
 
Graph Analytics with ArangoDB
Graph Analytics with ArangoDBGraph Analytics with ArangoDB
Graph Analytics with ArangoDBArangoDB Database
 
Getting Started with ArangoDB Oasis
Getting Started with ArangoDB OasisGetting Started with ArangoDB Oasis
Getting Started with ArangoDB OasisArangoDB Database
 
Custom Pregel Algorithms in ArangoDB
Custom Pregel Algorithms in ArangoDBCustom Pregel Algorithms in ArangoDB
Custom Pregel Algorithms in ArangoDBArangoDB Database
 
Hacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge GraphsHacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge GraphsArangoDB Database
 
A Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release WebinarA Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release WebinarArangoDB Database
 
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?ArangoDB Database
 
ArangoML Pipeline Cloud - Managed Machine Learning Metadata
ArangoML Pipeline Cloud - Managed Machine Learning MetadataArangoML Pipeline Cloud - Managed Machine Learning Metadata
ArangoML Pipeline Cloud - Managed Machine Learning MetadataArangoDB Database
 
ArangoDB 3.7 Roadmap: Performance at Scale
ArangoDB 3.7 Roadmap: Performance at ScaleArangoDB 3.7 Roadmap: Performance at Scale
ArangoDB 3.7 Roadmap: Performance at ScaleArangoDB Database
 
Webinar: What to expect from ArangoDB Oasis
Webinar: What to expect from ArangoDB OasisWebinar: What to expect from ArangoDB Oasis
Webinar: What to expect from ArangoDB OasisArangoDB Database
 
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019ArangoDB Database
 
Webinar: How native multi model works in ArangoDB
Webinar: How native multi model works in ArangoDBWebinar: How native multi model works in ArangoDB
Webinar: How native multi model works in ArangoDBArangoDB Database
 
An introduction to multi-model databases
An introduction to multi-model databasesAn introduction to multi-model databases
An introduction to multi-model databasesArangoDB Database
 
Running complex data queries in a distributed system
Running complex data queries in a distributed systemRunning complex data queries in a distributed system
Running complex data queries in a distributed systemArangoDB Database
 

More from ArangoDB Database (20)

ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
 
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
 
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
 
ArangoDB 3.9 - Further Powering Graphs at Scale
ArangoDB 3.9 - Further Powering Graphs at ScaleArangoDB 3.9 - Further Powering Graphs at Scale
ArangoDB 3.9 - Further Powering Graphs at Scale
 
GraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDBGraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDB
 
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
 
Graph Analytics with ArangoDB
Graph Analytics with ArangoDBGraph Analytics with ArangoDB
Graph Analytics with ArangoDB
 
Getting Started with ArangoDB Oasis
Getting Started with ArangoDB OasisGetting Started with ArangoDB Oasis
Getting Started with ArangoDB Oasis
 
Custom Pregel Algorithms in ArangoDB
Custom Pregel Algorithms in ArangoDBCustom Pregel Algorithms in ArangoDB
Custom Pregel Algorithms in ArangoDB
 
Hacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge GraphsHacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge Graphs
 
A Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release WebinarA Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release Webinar
 
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
 
ArangoML Pipeline Cloud - Managed Machine Learning Metadata
ArangoML Pipeline Cloud - Managed Machine Learning MetadataArangoML Pipeline Cloud - Managed Machine Learning Metadata
ArangoML Pipeline Cloud - Managed Machine Learning Metadata
 
ArangoDB 3.7 Roadmap: Performance at Scale
ArangoDB 3.7 Roadmap: Performance at ScaleArangoDB 3.7 Roadmap: Performance at Scale
ArangoDB 3.7 Roadmap: Performance at Scale
 
Webinar: What to expect from ArangoDB Oasis
Webinar: What to expect from ArangoDB OasisWebinar: What to expect from ArangoDB Oasis
Webinar: What to expect from ArangoDB Oasis
 
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019
 
3.5 webinar
3.5 webinar 3.5 webinar
3.5 webinar
 
Webinar: How native multi model works in ArangoDB
Webinar: How native multi model works in ArangoDBWebinar: How native multi model works in ArangoDB
Webinar: How native multi model works in ArangoDB
 
An introduction to multi-model databases
An introduction to multi-model databasesAn introduction to multi-model databases
An introduction to multi-model databases
 
Running complex data queries in a distributed system
Running complex data queries in a distributed systemRunning complex data queries in a distributed system
Running complex data queries in a distributed system
 

Introduction to column oriented databases

  • 1. Column-oriented databases NoSQL Cologne UG 07. 11. 2012 Jan Steemann (triAGENS) twitter: @steemann © 2012 triAGENS GmbH | 2012-11-07 1
  • 2. Types of data stores Key / value stores (opaque / typed) Document stores (non-shaped / shaped) collection key value key document value key value key document ... ... Relational databases table key value row column column key row ... © 2012 triAGENS GmbH | 2012-11-07 2
  • 3. Key / value stores (opaque)  Keys are mapped to values  Values are treated as BLOBs (opaque data)  No type information is stored  Values can be heterogenous key value key value Example values:  { name: „foo“, age: 25, city: „bar“ } => JSON, but store will not care about it  xdexadxb0x0b => binary, but store will not care about it © 2012 triAGENS GmbH | 2012-11-07 3
  • 4. Key / value stores (typed)  Keys are mapped to values  Values have simple type information attached  Type information is stored per value  Values can still be heterogenous key type value key type value Example values:  number: 25 => numeric, store can do something with it  list: [ 1, 2, 3 ] => list, store can do something with it © 2012 triAGENS GmbH | 2012-11-07 4
  • 5. Document stores (non-shaped)  Keys are mapped to documents  Documents consist of attributes  Attributes are name/typed value pairs, which may be nested  Type information is stored per attribute  Documents can be heterogenous  Documents may be organised in collections or databases document key name type value name type value document key name type name type value name type value Example documents:  { name: „foo“, age: 25, city: „bar“ } => attributes and sub-attributes  { name: { first: „foo“, last: „bar“ }, age: 25 } are typed and can be indexed © 2012 triAGENS GmbH | 2012-11-07 5
  • 6. Document stores (shaped)  Same as document stores, but...  ...document type information is stored in shapes  ...documents with similar structure (attribute names and types) point to the same shape document shape key shape value value type name type name document shape key shape value value value type document key shape value value value © 2012 triAGENS GmbH | 2012-11-07 6
  • 7. Relational databases  Tables (relations) consist of rows and columns  Columns have a type. Type information is stored once per column  A rows contains just values for a record (no type information)  All rows in a table have the same columns and are homogenous table column type column type column type column type column type row key value value value value value row key value value value value value Example rows:  „foo“, „bar“, 25, 35.63  „bar“, „baz“, 42, -673.342 © 2012 triAGENS GmbH | 2012-11-07 7
  • 8. Data stores, summary  Documents in document stores can be heterogenous: Different documents can have different attributes and types Type information is stored per document or group of documents Homogenity Values in key / value stores can be heterogenous: Flexibility  Different values can have different types Type information is either not stored at all or stored per value  Rows in relational databases are homogenous: Different rows have the same columns and types Type information is stored once per column  Which data store type is ideal for the job depends on the types of queries that must be run on the data! © 2012 triAGENS GmbH | 2012-11-07 8
  • 9. Query types: OLTP  „transactional“ processing  retrieve or modify individual records (mostly few records)  use indexes to quickly find relevant records  queries often triggered by end user actions and should complete instantly  ACID properties may be important  mixed read/write workload  working set should fit in RAM © 2012 triAGENS GmbH | 2012-11-07 9
  • 10. Query types: OLAP  analytical processing / reporting  derive new information from existing data (aggregates, transformations, calculations)  queries often run on many records or complete data set  data set may exceed size of RAM easily  mainly read or even read-only workload  ACID properties often not important, data can often be regenerated  queries often run interactively  common: not known in advance which aspects are interesting so pre-indexing „relevant“ columns is difficult © 2012 triAGENS GmbH | 2012-11-07 10
  • 11. OLTP vs. OLAP  properties of OLTP and OLAP queries are very different  most data stores are specialised only for one class of queries  where to run which type of queries? OLTP Key / value store ??? Document store OLAP Relational database © 2012 triAGENS GmbH | 2012-11-07 11
  • 12. OLAP in relational databases  Data in relational databases is fully typed  Nested values can be simulated using auxilliary tables, joins etc.  Most relational databases offer SQL which allows complex analysis queries  Type overhead in relational databases is very low, allowing fast processing of data  Overhead for (often unneeded) ACID properties may produce a signifcant performance penalty © 2012 triAGENS GmbH | 2012-11-07 12
  • 13. Row vs. columnar relational databases  All relational databases deal with tables, rows, and columns  But there are sub-types:  row-oriented: they are internally organised around the handling of rows  columnar / column-oriented: these mainly work with columns  Both types usually offer SQL interfaces and produce tables (with rows and columns) as their result sets  Both types can generally solve the same queries  Both types have specific use cases that they're good for (and use cases that they're not good for) © 2012 triAGENS GmbH | 2012-11-07 13
  • 14. Row vs. columnar relational databases  In practice, row-oriented databases are often optimised and particuarly good for OLTP workloads  whereas column-oriented databases are often well-suited for OLAP workloads  this is due to the different internal designs of row- and column-oriented databases © 2012 triAGENS GmbH | 2012-11-07 14
  • 15. Row-oriented storage  In row-oriented databases, row value data is usually stored contiguously: row0 header column0 value column1 value column2 value column3 value row1 header column0 value column1 value column2 value column3 value row2 header column0 value column1 value column2 value column3 value (the row headers contain record lengths, NULL bits etc.) © 2012 triAGENS GmbH | 2012-11-07 15
  • 16. Row-oriented storage  When looking at a table's datafile, it could look like this: header values header values header values header values filesize  Actual row values are stored at specific offsets of the values struct: values values values values header header header header c0 c1 c2 c0 c1 c2 c0 c1 c2 c0 c1 c2  Offsets depend on column types, e.g. 4 for int32, 8 for int64 etc. © 2012 triAGENS GmbH | 2012-11-07 16
  • 17. Row-oriented storage  To read a specific row, we need to determine its position first  This is easy if rows have a fixed length:  position = row number * row length + header length  header length = 8  row length = header length (8) + value length (8) = 16  value length = c0 length (2) + c1 length (2) + c2 length (4) = 8 row 0 row 1 row 2 values values values header header header c0 c1 c2 c0 c1 c2 c0 c1 c2 8 2 2 4 8 2 2 4 8 2 2 4 16 16 16 © 2012 triAGENS GmbH | 2012-11-07 17
  • 18. Row-oriented storage  In reality its often not that easy  With varchars, values can have variable lengths header values header values header values filesize  With variable length values, we cannot simply jump in the middle of the datafile because we don't know where a row starts  To calculate the offset of a specifc row, we need to use the record lengths stored in the headers. This means traversing headers! © 2012 triAGENS GmbH | 2012-11-07 18
  • 19. Row-oriented storage  Another strategy is to split varchar values into two parts:  prefixes of fixed-length are stored in-row  excess data exceeding the prefix length are stored off-row, and are pointed to from the row  This allows values to have fixed lengths again, at the price of additional lookups for the excess data: header values header values header values header values excess varchar data © 2012 triAGENS GmbH | 2012-11-07 19
  • 20. Row-oriented storage  Datafiles are often organised in pages  To read data for a specific row, the full page needs to be read  The smaller the rows, the more will fit onto a page page page values values values header unused space header header c0 c1 c2 c0 c1 c2 c0 c1 c2  In reality, pages are often not fully filled up entirely  This allows later update-in-place without page splits or movements (these are expensive operations) but may waste a lot of space  Furthermore, rows must fit onto pages, leaving parts unused © 2012 triAGENS GmbH | 2012-11-07 20
  • 21. Row-oriented storage  Row-oriented storage is good if we need to touch one row. This normally requires reading/writing a single page  Row-oriented storage is beneficial if all or most columns of a row need to be read or written. This can be done with a single read/write.  Row-oriented storage is very inefficient if not all columns are needed but a lot of rows need to be read:  Full rows are read, including columns not used by a query  Reads are done page-wise. Not many rows may fit on a page when rows are big  Pages are normally not fully filled, which leads to reading lots of unused areas  Record (and sometimes page) headers need to be read, too but do not contain actual row data © 2012 triAGENS GmbH | 2012-11-07 21
  • 22. Row-oriented storage  Row-oriented storage is especially inefficient when only a small amout of columns is needed but the table has many columns  Example: need only c1, but must read 2 pages to get just 3 values! page page values values values header unused space header header c0 c1 c2 c0 c1 c2 c0 c1 c2  We'd prefer getting 12 values with 1 read. This would be optimal: page values c1 c1 c1 c1 c1 c1 c1 c1 c1 c1 c1 c1 © 2012 triAGENS GmbH | 2012-11-07 22
  • 23. Column-oriented storage  Column-oriented databases primarily work on columns  All columns are treated individually  Values of a single column are stored contiguously  This allows array-processing the values of a column  Rows may be constructed from column values later if required  This means column stores can still produce row output (tables)  Values from multiple columns need to be retrieved and assembled for that, making implementation of bit more complex  Query processors in columnar databases work on columns, too © 2012 triAGENS GmbH | 2012-11-07 23
  • 24. Column-oriented storage  Column stores can greatly improve the performance of queries that only touch a small amount of columns  This is because they will only access these columns' particular data  Simple math: table t has a total of 10 GB data, with  column a: 4 GB  column b: 2 GB  column c: 3 GB  column d: 1 GB  If a query only uses column d, at most 1 GB of data will be processed by a column store  In a row store, the full 10 GB will be processed © 2012 triAGENS GmbH | 2012-11-07 24
  • 25. Column-oriented storage  Column stores store data in column-specific files  Simplest case: one datafile per column  Row values for each column are stored contiguously column0 values column0 r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15 r16 r17 filesize column1 values column1 r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15 r16 r17 filesize © 2012 triAGENS GmbH | 2012-11-07 25
  • 26. Column-oriented storage  With every page read, we can read the values for many rows at once. We'll get more values per I/O than with row-storage  This is ideal if we want to process most records anyway  If values are fixed-length, we are also allowed random access to each record's value  Saving header information in the values files is undesired. Can save meta information separately, e.g. NULL bits as a bit vector per column © 2012 triAGENS GmbH | 2012-11-07 26
  • 27. Column-oriented storage  All data within each column datafile have the same type, making it ideal for compression  Usually a much better compression factor can be achieved for single columns than for entire rows  Compression allows reducing disk I/O when reading/writing column data but has some CPU cost  For data sets bigger than the memory size compression is often beneficial because disk access is slower than decompression © 2012 triAGENS GmbH | 2012-11-07 27
  • 28. Column-oriented storage  A good use case for compression in column stores is dictionary compression for variable length string values  Each unique string is assigned an integer number  The dictionary, consisting of integer number and string value, is saved as column meta data  Column values are then integers only, making them small and fixed width  This can save much space if string values are non-unique  With dictionaries sorted by column value, this will also allow range queries © 2012 triAGENS GmbH | 2012-11-07 28
  • 29. Column-oriented storage  In a column database, it is relatively cheap to traverse all values of a column  Building an index on a column only requires reading that column's data, not the complete table data as in a row store  Adding or deleting columns is a relatively cheap operation, too © 2012 triAGENS GmbH | 2012-11-07 29
  • 30. Column-oriented storage, segments  Column data in column stores is ofted grouped into segments/packets of a specific size (e.g. 64 K values)  Meta data is calculated and stored seperately per segment, e.g.:  min value in segment  max value in segment  number of NOT NULL values in segment  histograms  compression meta data © 2012 triAGENS GmbH | 2012-11-07 30
  • 31. Column-oriented storage, segments  Segment meta data can be checked during query processing when no indexes are available  Segment meta data may provide information about whether the segment can be skipped entirely, allowing to reduce the number of values that need to be processed in the query  Calculating segment meta data is a relatively cheap operation (only needs to traverse column values in segment) but still should occur infrequently  In a read-only or read-mostly workload this is tolerable © 2012 triAGENS GmbH | 2012-11-07 31
  • 32. Column-oriented processing  Column values are not processed row-at-a-time, but block-at-a-time  This reduces the number of function calls (function call per block of values, but not per row)  Operating in blocks allows compiler optimisations, e.g. loop unrolling, parallelisation, pipelining  Column values are normally positioned in contiguous memory locations, also allowing SIMD operations (vectorisation)  Working on many subsequent memory positions also improves cache usage (multiple values are in the same cache line) and reduces pipeline stalls  All of the above make column stores ideal for batch processing © 2012 triAGENS GmbH | 2012-11-07 32
  • 33. Column-oriented processing  Reading all columns of a row is an expensive operation in a column store, so full row tuple construction is avoided or delayed as much as possible internally  Updating or inserting rows may also be very expensive and may cost much more time than in a row store  Some column stores are hybrids, with read-optimised (column) storage and write-optimised OLTP storage  Still, column stores are not really made for OLTP workloads, and if you need to work with many columns at once, you'll pay a price in a column store © 2012 triAGENS GmbH | 2012-11-07 33
  • 34. Column database products  For a long time, there weren't many columnar databases around  The topic re-emerged in 2006, when a paper about „C-Store“ was published. C-Store turned into „Vertica“ (commercial) later  Now, there are many column stores around, but most are commercial  Many vendors do not sell just the software, but also applicances with specialised hardware and setup  Open source column stores include:  MonetDB  LucidDB (R.I.P.?)  Infobright community edition (with reduced feature set, based on MySQL!!) © 2012 triAGENS GmbH | 2012-11-07 34