Columnstore Indexes
Deep introduction into columnar storage and
indexes in SQL Server 2012

Denis Reznik
Sponsors
About me







Denis Reznik
Kiev, Ukraine
Database Architect at The Frayman Group
Microsoft MVP
Community enthus...
Agenda





Columnar storage
Creation of Columnstore index
Usage scenarios and limitations
Performance accelerators
 ...
Row Store and Column Store

• In row store, data is stored tuple by tuple.
• In column store, data is stored column by
col...
Row Store and Column Store


Most queries do not process all the attributes of a particular
relation.
nam addres...
Creating a columnstore index

T-SQL

SSMS
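
A minimal T-SQL sketch of what the statement on this slide could look like. The table dbo.Purchase and its Date column appear later in the deck; the index name and the CustomerId and Amount columns are hypothetical and only there for illustration.

```sql
-- Hypothetical index and column names; adjust them to your schema.
-- SQL Server 2012 supports only NONCLUSTERED columnstore indexes,
-- and a table can have at most one of them.
CREATE NONCLUSTERED COLUMNSTORE INDEX IX_Purchase_Columnstore
ON dbo.Purchase ([Date], CustomerId, Amount);
```

Because each column is stored separately, it usually makes sense to list every column that the reporting queries touch.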
Usage scenarios and limitations
• Primary focus of Columnstore Indexes is DW
databases
• In SQL Server 2012 Columnstore In...
DEMO
• Incredible Performance of Columnstore
Indexes
How Are These Performance Gains
Achieved?
• Two complementary technologies:
• Storage
• Data is stored in a compressed col...
Compression
• Patented VERTIPAQ algorithms
• So, there is no public information about how the
data is actually compressed

• ...
DEMO
• Columnstore Indexes Internals
Columnar storage structure

Row store:

…
C1

Column store:

C2

C3

C4

C5

C6

Pages
Column Segments and Dictionaries
segment 1

C1

C2

C3

C4

C5

C6

Set of
about 1M
rows

…
segment N

Column
Segment

dic...
DEMO
• Columnstore Indexes – Segments and
Dictionaries
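
Segments and dictionaries can be inspected through the catalog views sys.column_store_segments and sys.column_store_dictionaries. The sketch below assumes the columnstore index lives on a table named dbo.Purchase (the table name used later in the deck); it is one possible way to look at this metadata, not necessarily what the demo showed.

```sql
-- Segments per column: how many there are, how many rows, how much space.
SELECT s.column_id,
       COUNT(*)                         AS segment_count,
       SUM(CAST(s.row_count AS bigint)) AS total_rows,
       SUM(s.on_disk_size)              AS total_bytes
FROM sys.column_store_segments AS s
JOIN sys.partitions AS p
    ON s.hobt_id = p.hobt_id
WHERE p.object_id = OBJECT_ID(N'dbo.Purchase')   -- assumed table name
GROUP BY s.column_id
ORDER BY s.column_id;

-- Dictionaries used by the same index.
SELECT d.column_id, d.dictionary_id, d.type, d.entry_count, d.on_disk_size
FROM sys.column_store_dictionaries AS d
JOIN sys.partitions AS p
    ON d.hobt_id = p.hobt_id
WHERE p.object_id = OBJECT_ID(N'dbo.Purchase')   -- assumed table name
ORDER BY d.column_id, d.dictionary_id;
```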
Memory management
• Memory management is automatic
• Columnstore is persisted on disk
• Needed columns fetched into memory...
Batch mode processing
Batch object

bitmap of qualifying rows

Column vectors

• Process ~1000 rows at a
time
• Vector ope...
Segment Elimination
• Segment (rowgroup) = 1 million row chunk
• Min, Max kept for each column in a segment
• Scans can sk...
DEMO
• Segment Elimination
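
The min/max values that drive segment elimination are exposed in sys.column_store_segments. A sketch, assuming the index is on dbo.Purchase and that the Date column has column_id 1 in that index, as in the table on the Segment Elimination slide:

```sql
-- Min/max metadata kept per segment for one column.
-- column_id = 1 is an assumption taken from the slide's example table.
SELECT s.column_id, s.segment_id, s.min_data_id, s.max_data_id
FROM sys.column_store_segments AS s
JOIN sys.partitions AS p
    ON s.hobt_id = p.hobt_id
WHERE p.object_id = OBJECT_ID(N'dbo.Purchase')
  AND s.column_id = 1
ORDER BY s.segment_id;

-- The query from the slide: a segment whose max_data_id is below the
-- filter value cannot contain qualifying rows, so the scan can skip it.
SELECT [Date], COUNT(*) AS PurchaseCount
FROM dbo.Purchase
WHERE [Date] >= '20120201'
GROUP BY [Date];
```

Elimination works best when the data was loaded in an order that keeps the min/max ranges of different segments from overlapping.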
Maintaining Data in a Columnstore Index
• Once built, the table becomes “read-only”
and INSERT/UPDATE/DELETE/MERGE is
no l...
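
The workarounds this slide lists for a read-only columnstore table (drop and re-create the index, UNION ALL, partition switching) could be sketched as follows. Every object name here (IX_Purchase_Columnstore, dbo.Purchase_Delta, dbo.Purchase_Staging, dbo.PurchaseAll, the CustomerId and Amount columns, partition number 5) is hypothetical.

```sql
-- Option 1: drop the columnstore index, modify the data, re-create the index.
DROP INDEX IX_Purchase_Columnstore ON dbo.Purchase;

INSERT INTO dbo.Purchase ([Date], CustomerId, Amount)
VALUES ('20120301', 42, 19.99);

CREATE NONCLUSTERED COLUMNSTORE INDEX IX_Purchase_Columnstore
ON dbo.Purchase ([Date], CustomerId, Amount);
GO  -- batch separator understood by SSMS and sqlcmd

-- Option 2: keep new rows in a small updatable rowstore table and
-- combine both tables with UNION ALL (validate the performance).
CREATE VIEW dbo.PurchaseAll
AS
SELECT [Date], CustomerId, Amount FROM dbo.Purchase        -- read-only, columnstore
UNION ALL
SELECT [Date], CustomerId, Amount FROM dbo.Purchase_Delta; -- updatable rowstore
GO

-- Option 3: load a staging table that has the same structure and the same
-- columnstore index, then switch it in. Assumes dbo.Purchase is
-- partitioned and target partition 5 is empty.
ALTER TABLE dbo.Purchase_Staging
SWITCH TO dbo.Purchase PARTITION 5;
```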
Columnstore Index Future
• Actually, it is already the present :)
• Columnstore indexes can be clustered (in
SQL Server 2014)
• C...
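
A short sketch of the SQL Server 2014 behaviour described on this slide, again using the hypothetical dbo.Purchase table and made-up column names. It assumes the table has no other indexes, since a clustered columnstore index in SQL Server 2014 must be the only index on its table.

```sql
-- SQL Server 2014: the columnstore becomes the table's primary storage.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Purchase ON dbo.Purchase;

-- The table stays writable; new rows go to a rowstore delta store first.
INSERT INTO dbo.Purchase ([Date], CustomerId, Amount)
VALUES ('20140101', 42, 19.99);

-- Row groups and their state: delta row groups show as OPEN or CLOSED,
-- compressed column segments show as COMPRESSED.
SELECT row_group_id, state_description, total_rows
FROM sys.column_store_row_groups
WHERE object_id = OBJECT_ID(N'dbo.Purchase');
```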
Summary






Columnar storage
Columnstore Performance Demo
Creation of Columnstore index
Usage scenarios and limitat...
Sponsors
Thank you!
• Denis Reznik






Twitter: @denisreznik
Email: denisreznik@live.ru
Blog (in Russian): http://reznik.une...

SqlSaturday199 - Columnstore Indexes


In-memory features are the most promising trend in the area of high performance. Columnstore indexes are one such feature, and even with their restrictions they can make your queries many times faster. How can you get more from this feature? In which situations should you use them? Which internal mechanisms make this possible? This session answers these questions.

Published in: Technology

SqlSaturday199 - Columnstore Indexes

  1. Columnstore Indexes: Deep introduction into columnar storage and indexes in SQL Server 2012. Denis Reznik
  2. Sponsors
  3. About me
     • Denis Reznik
     • Kiev, Ukraine
     • Database Architect at The Frayman Group
     • Microsoft MVP
     • Community enthusiast
  4. Agenda
     • Columnar storage
     • Creation of Columnstore index
     • Usage scenarios and limitations
     • Performance accelerators
     • Columnstore Storage internals
     • Columnstore Execution mode internals
     • Columnstore index maintenance
     • Columnstore Future (actually Present :)
  5. Row Store and Column Store
     • In row store, data is stored tuple by tuple.
     • In column store, data is stored column by column.
  6. Row Store and Column Store
     • Most queries do not process all the attributes of a particular relation.
     • Example table columns: id, name, address, city, state, age
     • Example query: SELECT c.Name, c.Address FROM Customers c WHERE c.City = 'Sofia'
  7. Creating a columnstore index: T-SQL, SSMS
  8. Usage scenarios and limitations
     • Primary focus of Columnstore Indexes is DW databases
     • In SQL Server 2012 Columnstore Indexes are read-only
     • Supported operators and data types are limited
  9. DEMO: Incredible Performance of Columnstore Indexes
  10. How Are These Performance Gains Achieved?
      • Two complementary technologies:
        • Storage
          • Data is stored in a compressed columnar data format (stored by column) instead of row store format (stored by row).
        • New “batch mode” execution
          • Vector-based query execution capability
          • Data can then be processed in batches versus row-by-row
      • Depending on filtering and other factors, a query may also benefit from “segment elimination”: bypassing million-row chunks (segments) of data, further reducing I/O
  11. Compression
      • Patented VERTIPAQ algorithms
      • So, there is no public information about how the data is actually compressed
      • But we do have some information:
        • Dictionary encoding
        • Run-length encoding
        • Bit-vector encoding
        • …
  12. DEMO: Columnstore Indexes Internals
  13. Columnar storage structure [Figure: row store vs. column store page layout for columns C1 to C6]
  14. Column Segments and Dictionaries [Figure: columns C1 to C6 split into segment 1 … segment N, each a set of about 1M rows; column segments and dictionaries]
  15. DEMO: Columnstore Indexes – Segments and Dictionaries
  16. Memory management
      • Memory management is automatic
      • Columnstore is persisted on disk
      • Needed columns are fetched into memory
      • A columnstore segment is the unit of data transfer between disk and memory
      • Example query: SELECT C2, SUM(C4) FROM T GROUP BY C2; [Figure: of the column segments of table T, only those for C2 and C4 are fetched]
  17. Batch mode processing
      • Process ~1000 rows at a time
      • Vector operators implemented
      • Greatly reduced CPU time (7 to 40X)
      • [Figure: batch object with column vectors and a bitmap of qualifying rows]
  18. Segment Elimination
      • Segment (rowgroup) = 1 million row chunk
      • Min, Max kept for each column in a segment
      • Scans can skip segments based on this info

      column_id  segment_id  min_data_id  max_data_id
      1          1           20120101     20120131     (skipped)
      1          2           20120115     20120215
      1          3           20120201     20120228

      select Date, count(*) from dbo.Purchase where Date >= '20120201' group by Date
  19. DEMO: Segment Elimination
  20. Maintaining Data in a Columnstore Index
      • Once built, the table becomes “read-only” and INSERT/UPDATE/DELETE/MERGE is no longer allowed
      • ALTER INDEX REBUILD / REORGANIZE not allowed
      • How can I modify index data?
        • Drop columnstore index / make modifications / add columnstore index
        • UNION ALL (but be sure to validate performance)
        • Partition switches (IN and OUT)
  21. Columnstore Index Future
      • Actually, it is already the present :)
      • Columnstore indexes can be clustered (in SQL Server 2014)
      • Clustered Columnstore indexes can be updatable (in SQL Server 2014)
      • Updated data (deltas) is stored in a rowstore until a segment can be created
  22. Summary
      • Columnar storage
      • Columnstore Performance Demo
      • Creation of Columnstore index
      • Usage scenarios and limitations
      • Performance accelerators
      • Columnstore Storage internals
      • Columnstore Execution mode internals
      • Columnstore index maintenance
      • Columnstore Future (actually Present :)
  23. Sponsors
  24. Thank you!
      • Denis Reznik
      • Twitter: @denisreznik
      • Email: denisreznik@live.ru
      • Blog (in Russian): http://reznik.uneta.com.ua
      • Facebook: https://www.facebook.com/denis.reznik.5
      • LinkedIn: http://ua.linkedin.com/pub/denis-reznik/3/502/234