Column orientation - rotate your thinking 90 degrees


  • Daniel Lemire (lemire), 4 weeks ago
    Good work. Note that column orientation goes back to the seventies.

Column orientation - rotate your thinking 90 degrees - Presentation Transcript

  1. Column based Databases. Shweta Agrawal
  2. What is a column based DB?

     ID  NAME           SEX  AGE  SALARY  ADDRESS  PHONE  PAN...
     1   Sunil Sharma   M    40   10,000  ...      ...    ...
     2   Neha Agarwal   F    25   12,000  ...      ...    ...
     3   Anant Agarwal  M    28   15,000  ...      ...    ...
     4   Vishal Mehta   M    30   8,000   ...      ...    ...

     One page of the table storage

     Row based storage:
     1|Shweta Agrawal|M|40|10000...|2|Neha Agrawal|F|25|12000...|3|Anant Agarwal|M|28|15000...|4|Vishal Mehta|M|30|8000...

     Column based storage:
     1|2|3|4...|Shweta Agrawal|Neha Agrawal|Anant Agarwal|Vishal Mehta...|M|F|M|M...|40|25|28|30...|10000|12000|15000|8000...
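The two page layouts on slide 2 can be reproduced with a minimal Python sketch. The sample rows copy the slide's table (without the elided ADDRESS, PHONE and PAN columns); the function names `row_layout` and `column_layout` are hypothetical and only illustrate the ordering of values, not any real storage format.

```python
# Sample data copied from the slide's table: (ID, NAME, SEX, AGE, SALARY).
rows = [
    (1, "Sunil Sharma", "M", 40, 10000),
    (2, "Neha Agarwal", "F", 25, 12000),
    (3, "Anant Agarwal", "M", 28, 15000),
    (4, "Vishal Mehta", "M", 30, 8000),
]


def row_layout(rows):
    """Serialize record by record: every field of row 1, then row 2, and so on."""
    return "|".join(str(field) for row in rows for field in row)


def column_layout(rows):
    """Serialize column by column: all IDs, then all names, and so on."""
    columns = zip(*rows)  # transpose the rows into columns
    return "|".join(str(value) for column in columns for value in column)


row_page = row_layout(rows)
column_page = column_layout(rows)
print(row_page)
print(column_page)
```

Both pages hold exactly the same values; only the order differs, which is what decides how many bytes a query has to read.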
  3. Column stores

     1st page of each column store:

     ID:     1|2|3|4|5|6|....
     NAME:   Shweta Agrawal|Neha Agrawal|Anant Agarwal|Vishal Mehta|Srinivas Pathak|Rubina Mehta....
     SEX:    M|F|M|M|M|F|F...
     AGE:    40|25|28|30|45|20...
     SALARY: 10000|12000|15000|8000|15000|5000...
  4. Query processing on row store

     SELECT name, salary FROM employee WHERE age > 40

     • Evaluate condition age > 40, possibly using an index on age.
     • Get a found-set containing row number/ID of rows that satisfy the above condition.
     • Retrieve all rows in the above found-set.
     • Send only name and salary from the rows as result to client.
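The row store steps on slide 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not how any particular database implements it: the table is a plain dict, the "index" is a sorted list of (age, row id) pairs, and the names `employee`, `age_index` and `query_row_store` are hypothetical. The sample values follow the column pages shown on slide 3.

```python
import bisect

# Hypothetical row store: row id -> full record (id, name, sex, age, salary).
employee = {
    1: (1, "Shweta Agrawal", "M", 40, 10000),
    2: (2, "Neha Agrawal", "F", 25, 12000),
    3: (3, "Anant Agarwal", "M", 28, 15000),
    4: (4, "Vishal Mehta", "M", 30, 8000),
    5: (5, "Srinivas Pathak", "M", 45, 15000),
    6: (6, "Rubina Mehta", "F", 20, 5000),
}
NAME, AGE, SALARY = 1, 3, 4  # field positions inside a record

# Assumed index on age: (age, row id) pairs kept in sorted order.
age_index = sorted((row[AGE], row_id) for row_id, row in employee.items())


def query_row_store(table, index, min_age):
    # Steps 1-2: use the index to get the found-set of row ids with age > min_age.
    start = bisect.bisect_right(index, (min_age, float("inf")))
    found_set = [row_id for _, row_id in index[start:]]
    # Step 3: retrieve the complete rows, including columns the query never uses.
    full_rows = [table[row_id] for row_id in found_set]
    # Step 4: send only name and salary to the client.
    return [(row[NAME], row[SALARY]) for row in full_rows]


result = query_row_store(employee, age_index, 40)
print(result)
```

Note that step 3 reads every field of each matching row even though only two fields end up in the result.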
  5. Query processing on a column store

     SELECT name, salary FROM employee WHERE age > 40

     • Evaluate condition age > 40 on the age column, using an index if present.
     • Get a found-set containing row number/ID of rows that satisfy the above condition.
     • Retrieve names from the name column store for all rows in the found-set.
     • Retrieve salaries from the salary column store for all rows in the found-set.
     • Associate name with salary by row ID/number for the final result.
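The column store steps on slide 5 can be sketched the same way. This is a minimal illustration that assumes each column is a Python list in which position i holds the value for row i; the names `columns` and `query_column_store` are hypothetical, and no index is used, so the condition is evaluated by scanning the age column.

```python
# Hypothetical column store: one list per column, position = row number.
# Values follow the column pages shown on slide 3.
columns = {
    "id": [1, 2, 3, 4, 5, 6],
    "name": ["Shweta Agrawal", "Neha Agrawal", "Anant Agarwal",
             "Vishal Mehta", "Srinivas Pathak", "Rubina Mehta"],
    "sex": ["M", "F", "M", "M", "M", "F"],
    "age": [40, 25, 28, 30, 45, 20],
    "salary": [10000, 12000, 15000, 8000, 15000, 5000],
}


def query_column_store(columns, min_age):
    # Steps 1-2: scan only the age column to build the found-set of row numbers.
    found_set = [pos for pos, age in enumerate(columns["age"]) if age > min_age]
    # Step 3: read names for the found-set from the name column only.
    names = [columns["name"][pos] for pos in found_set]
    # Step 4: read salaries for the found-set from the salary column only.
    salaries = [columns["salary"][pos] for pos in found_set]
    # Step 5: associate name with salary by row number.
    return list(zip(names, salaries))


result = query_column_store(columns, 40)
print(result)
```

The `id` and `sex` columns are never touched, which is the source of the IO saving claimed for column stores.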
  6. A quick calculation of IO
     • Table has 10 columns and 1 million rows.
     • Each row is 100 bytes (1 million rows * 100 bytes = 100MB in total).
     • 30% of employees are above age 40.
     • Total amount of data read in row based store = 100MB * 0.3 = 30MB
     • Total amount of data read in column based store = 100MB * 0.3 * 0.2 (only 2 of the 10 columns) = 6MB
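The arithmetic on slide 6 can be checked with a few lines of Python. The sketch assumes all 10 columns are equally wide (10 bytes each), which is what the slide's factor of 0.2 implies, and it follows the slide's simplification of counting only the name and salary data that is fetched. The constant names are hypothetical; integer arithmetic is used to avoid rounding noise.

```python
NUM_COLUMNS = 10          # columns in the table
NUM_ROWS = 1_000_000      # rows in the table
ROW_BYTES = 100           # bytes per row
SELECTED_PERCENT = 30     # share of employees above age 40
COLUMNS_NEEDED = 2        # name and salary
MB = 1_000_000            # bytes per MB, as used on the slide

table_bytes = NUM_ROWS * ROW_BYTES                      # 100MB
row_store_bytes = table_bytes * SELECTED_PERCENT // 100
# Assumption: every column has the same width, so 2 of 10 columns = 0.2 of a row.
column_store_bytes = row_store_bytes * COLUMNS_NEEDED // NUM_COLUMNS

print(f"row store:    {row_store_bytes // MB}MB")
print(f"column store: {column_store_bytes // MB}MB")
```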
  7. Why is it important?
     • Wide fact tables in data-warehouses
     • Analytics queries on a data-warehouse tend to aggregate/analyse a few columns but a large number of rows.
     • Full table scans for analytics queries in row stores
     • Normalization means more joins
  8. An example star schema    
  9. Benefits of column based DB
     • Fewer pages read = less IO = faster queries
     • Processing becomes CPU bound instead of IO bound
     • Compression
       • Page level compression
       • Column level compression (lookup tables)
     • Natural intra-query parallelism on conditions on different columns
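Column level compression with a lookup table, mentioned on slide 9, is often called dictionary encoding. The sketch below is one simple way to do it, not the scheme of any specific product: each distinct value is stored once in a lookup table and the column itself stores small integer codes. The function names are hypothetical.

```python
def dictionary_encode(values):
    """Replace each value by an integer code into a lookup table of distinct values."""
    lookup = []      # code -> value
    code_of = {}     # value -> code
    codes = []
    for value in values:
        if value not in code_of:
            code_of[value] = len(lookup)
            lookup.append(value)
        codes.append(code_of[value])
    return lookup, codes


def dictionary_decode(lookup, codes):
    """Rebuild the original column from the lookup table and the codes."""
    return [lookup[code] for code in codes]


# The SEX column page from slide 3: few distinct values, many repetitions.
sex_column = ["M", "F", "M", "M", "M", "F", "F"]
lookup, codes = dictionary_encode(sex_column)
print(lookup)
print(codes)
```

This works well on a column store because all values of one column sit together and tend to repeat, whereas a row store page mixes values of different types.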
  10. Row based equivalents
      • Index every column?
        • Maintenance: updates/inserts/deletes
        • Storage
        • Most importantly: an index is value=>id, a column is id=>value
        • Useful for selective queries only
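The difference between "index is value=>id" and "column is id=>value" on slide 10 can be made concrete with a small Python sketch. It assumes a column is a list addressed by row id and an index is a mapping from a value to the row ids that hold it; `build_index` is a hypothetical helper.

```python
from collections import defaultdict

# Column: position (row id - 1) -> value. Ages follow slide 3.
age_column = [40, 25, 28, 30, 45, 20]


def build_index(column):
    """Build a value -> list of row ids mapping, which is what an index provides."""
    index = defaultdict(list)
    for row_id, value in enumerate(column, start=1):
        index[value].append(row_id)
    return dict(index)


age_index = build_index(age_column)

# Index direction: start from a value, get the row ids.
ids_aged_45 = age_index.get(45, [])

# Column direction: start from a row id, get the value.
age_of_row_5 = age_column[5 - 1]

print(ids_aged_45, age_of_row_5)
```

An index answers "which rows have age 45?" cheaply, but a query that already has a found-set of row ids and needs their ages wants the id=>value direction, which only the column provides directly.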
  11. Row based equivalents
      • Vertical partitioning?
        • Joins (although fast ones)
        • Table overhead
        • Cannot use horizontal partitioning
        • Row based query engine not geared up to make use of the column based storage
  12. Summary
      • For ad hoc analytics queries, column based storage reduces IO and makes queries faster.
      • Column based query engines written from the ground up for analytics queries make good use of this storage.
      • Indexing every column, or vertical partitioning, is not the same as column based storage.
  13. References
      • Commercial products
        • Sybase IQ
        • Vertica
        • MySQL's InfoBright storage engine
      • To know more, read http://databasecolumn.vertica.com/

© All Rights Reserved
