Introduction to Columnstore Indexes
Jason Strate
e: jstrate@pragmaticworks.com
e: jasonstrate@gmail.com
b: www.jasonstrate.com
t: StrateSQL
Resources jasonstrate.com/go/indexing
Introduction
MAKING BUSINESS INTELLIGENT
www.pragmaticworks.com
About Pragmatic Works
• Industry leaders in Microsoft BI and SQL Server Platform
• SQL Server Professionals - PASS Board of Directors, Speakers, Authors and MVPs
• National Sales Team Divided by Microsoft Territories
• National System Integrator (NSI)
• Gold Certified in Business Intelligence and Data Platform
• Platform Modernization/Safe Passage
• Premier Partner for PDW SI Partner Program
– MS PDW Partner of Year FY13
– Frontline Partnership Partner of the Year for Big Data
– Executive sponsor - Andy Mouacdie, WW sales director PDW
• Over 7,200 customers worldwide
• Over 186,000 people in PW database for demand generation
Everyone wants fast queries
Sometimes Normal Doesn’t Make Sense
We need to change what we do
To get things back to normal
Agenda
• Introduction
• Columnstore Basics
• SQL Server Columnstore
• Non-Clustered Columnstore
• Clustered Columnstore
• Summary
Session Goals
• Identify differences between rowstore and columnstore indexes
• Describe the implementation of columnstore indexes in SQL Server
• Define pros and cons of columnstore indexes
• Demonstrate use of columnstore indexes
Index Problems
• Indexes traditionally are row-based
– All columns for a row are stored together
• Reading any column brings over all columns
– Impacts IO, CPU, and memory
– Typically 15% of DW columns are used in queries
• Databases growing 10x every 5 years
• Tuning for over-indexing, under-indexing, and bookmark lookups
Columnstore Basics
• Change data storage from row-based to column-based
– Rowstores
– Columnstores
• Data grouped by columns
– One column per grouping (segment)
– Data access at the column level
– Only return columns required for query
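The difference between the two layouts can be sketched with a toy model. This is only an illustration of the idea on the slide, not SQL Server's actual storage format; the table contents and function names are hypothetical.

```python
# Toy model only: a real columnstore keeps compressed segments on disk.
# The sample table and all names here are made up for illustration.
rows = [
    {"order_id": 1, "customer": "A", "quantity": 5, "price": 10.0},
    {"order_id": 2, "customer": "B", "quantity": 3, "price": 20.0},
    {"order_id": 3, "customer": "A", "quantity": 7, "price": 10.0},
]


def to_columnstore(rows):
    """Pivot row dictionaries into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def values_touched_rowstore(rows):
    """A row-based read brings every column of every row along."""
    return sum(len(row) for row in rows)


def values_touched_columnstore(columns, wanted):
    """A column-based read only touches the requested columns."""
    return sum(len(columns[name]) for name in wanted)


columns = to_columnstore(rows)
total_quantity = sum(columns["quantity"])  # reads a single column
```

Summing `quantity` from the row layout touches all 12 stored values, while the column layout touches only the 3 values in that one column.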
Say What?!
Traditional Index
Columnstore Index
Why Columnstore?
• Improved query performance
– Column centric reads
– Heavy data compression
– Processing by batch versus row
• Improved resource utilization
– Smaller index space
– Less memory required
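One reason a single column compresses well is that its values share a type and often repeat. A minimal sketch using run-length encoding follows; it is a stand-in technique chosen for illustration, since the slides do not say which algorithms SQL Server applies, and the sample column is hypothetical.

```python
def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([value, 1])  # start a new run
    return [tuple(run) for run in runs]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original list."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# Hypothetical column of nine values with long runs of repeats.
region = ["East"] * 4 + ["West"] * 3 + ["East"] * 2
encoded = rle_encode(region)
```

Here nine stored values shrink to three pairs, and `rle_decode(encoded)` restores the original column. In a row layout the same values would be interleaved with other columns, so runs like these would not form.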
SQL Server Columnstore
• Introduced to improve data warehouse performance
• Available since SQL Server 2012
• SQL Server 2012
– Non-clustered columnstore
• SQL Server 2014
– Clustered columnstore
Columnstore Attributes
• Introduced in SQL Server 2012
• Supports batch mode processing
• SQL Server 2012
– Non-clustered columnstore
– No key columns
– Supports partitioning
– Read-only index
• SQL Server 2014
– Clustered columnstore
– Columnstore archive format
– Read/write when clustered
Batch-mode Processing
1 record versus 415 records
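The contrast between row mode and batch mode can be sketched as the number of operator calls needed to process the same data. This is a toy model, not the SQL Server execution engine. Reading the slide's figure as one record per call versus 415 records per call is an interpretation, and the batch size below is borrowed from the slide purely for illustration.

```python
def sum_row_mode(values):
    """Row mode: the operator is invoked once per record."""
    total = 0
    calls = 0
    for value in values:
        total += value
        calls += 1
    return total, calls


def sum_batch_mode(values, batch_size):
    """Batch mode: the operator is invoked once per batch of records."""
    total = 0
    calls = 0
    for start in range(0, len(values), batch_size):
        batch = values[start:start + batch_size]
        total += sum(batch)
        calls += 1
    return total, calls


# 830 hypothetical records, i.e. exactly two batches of 415.
values = list(range(1, 831))
row_total, row_calls = sum_row_mode(values)
batch_total, batch_calls = sum_batch_mode(values, batch_size=415)
```

Both functions return the same total, but row mode makes 830 calls while batch mode makes 2, which is where the per-record overhead is saved.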
Columnstore Restrictions
• Cannot be a unique index
• Cannot be created on a view or indexed view
• Cannot include a sparse column
• Cannot act as a primary key or a foreign key
• Cannot be created with the INCLUDE keyword
• Cannot include the ASC or DESC keywords for sorting the index
• No seek operations
• Cannot contain a column with a FILESTREAM attribute
• Limited data types
– Not supported: varchar(max), nvarchar(max), binary, varbinary, ntext, text, image, uniqueidentifier, rowversion, timestamp, sql_variant, xml, CLR types
• Feature restrictions
– Replication
– Change tracking
– Change data capture
– Filestream
Non-Clustered Columnstore
• Available in:
– SQL Server 2012
– SQL Server 2012 Parallel Data Warehouse
– SQL Server 2014
• One non-clustered columnstore per table
– Supported on heaps and on tables with a clustered index
• Add all columns from the table
– No key columns
• Best suited to star-join style queries
• Avoid OUTER JOIN and NOT IN
– They prevent batch processing
Updating Non-Clustered Columnstore
• Drop and Rebuild
– Drop the columnstore
– Perform DML
– Recreate the columnstore
• Partition Switch
– Create new partition with modified data
– Create columnstore
– Switch in partition
• Partitioned View
– Apply changes to a separate table
– UNION ALL the columnstore table and the change table
– Manage DML through a trigger
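The partitioned view approach can be sketched as a toy model: reads combine a read-only store with a separate writable change table, and writes go only to the change table. All names and data here are hypothetical, and this models only the shape of the pattern, not T-SQL views or triggers.

```python
# Read-only data, standing in for the table with the columnstore index.
# A tuple is used so that it cannot be appended to by accident.
columnstore_rows = (
    (1, 100),
    (2, 200),
    (3, 300),
)

# Writable change table that receives all new rows.
change_rows = []


def insert_row(row):
    """Route a write to the change table (the role a trigger would play)."""
    change_rows.append(row)


def query_all():
    """Read both tables together, in the spirit of a UNION ALL view."""
    return list(columnstore_rows) + change_rows


insert_row((4, 400))
total_amount = sum(amount for _, amount in query_all())
```

Queries see all four rows while the read-only store stays untouched. In practice the change table would be folded back in periodically with one of the other two approaches.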
NON-CLUSTERED COLUMNSTORE
Demo
Clustered Columnstore
• Available in:
– SQL Server 2012 Parallel Data Warehouse
– SQL Server 2014
• One clustered columnstore per table
– Is primary storage for data
– Reduces space requirements
• Can leverage columnstore archive format
• Same design patterns as non-clustered
• Preferred over non-clustered columnstore
CLUSTERED COLUMNSTORE
Demo
Summary
• Columnstore indexes change how data can be accessed
• Provide increased data compression
• Retrieve only the columns that are needed
• Primary benefit in data warehouses
– Slowly changing or unchanging data
More Information
Expert Performance Indexing for SQL Server 2012
Jason Strate, Ted Krueger
Overview, Statistics, Maintenance, Tools, Analysis
http://amzn.com/1430237414
Services
Speed development through training and rapid development services from Pragmatic Works.
Products
BI products to convert to a Microsoft BI platform and simplify development on the platform.
Foundation
Helping those who do not have the means to get into information technology achieve their dreams.
For more information…
Name: Jason Strate
Email: jstrate@pragmaticworks.com
Blog: www.jasonstrate.com
Resource: jasonstrate.com/go/indexing
Need Help? jasonstrate.com/go/vmdba