Sql Server 2008 New Programmability Features


Published in: Technology
  1. 1. New Programmability Features<br />
  2. 2. Valinor<br />SQL Server Professional Services<br />Database Projects end-to-end<br />High Availability & Disaster Recovery<br />Upgrades<br />Performance Analysis & Tuning<br />Security<br />Training<br />contact@valinor.co.il<br />
  3. 3. Valinor<br />SQL Server Complementary Tools<br /><ul><li>SQL diagnostic manager
  4. 4. SQLcompliance manager
  5. 5. SQLsafe backup
  6. 6. SQLchange manager
  7. 7. SQLdefrag manager
  8. 8. SQL comparison toolset
  9. 9. SQLadmin toolset</li></li></ul><li>Valinor<br />http://www.sqlserver.co.il<br />
  10. 10. Agenda<br />T-SQL enhancements<br />MERGE Statement<br />Table Valued Parameters<br />Grouping Sets<br />Compound Assignments<br />Table value constructors<br />New data types<br />Filestream<br />HierarchyID<br />Temporal data types<br />Spatial data types<br />SSMS Enhancements<br />Intellisense<br />Multi-instances query<br />Registered Servers properties<br />SSRS Enhancements<br />More new features<br />Data compression<br />Resource Governor<br />
  11. 11. T-SQL Enhancements<br />MERGE Statement<br />Table Valued Parameters<br />Grouping Sets<br />Compound Assignments<br />Table value constructors<br />
  12. 12. MERGE Statement<br />What is it?<br />MERGE allows you to compare two tables and apply changes to one of them according to matching and non-matching rows<br />Allows multiple set operations in a single SQL statement<br />Operations can be INSERT, UPDATE, DELETE<br />Based on the ANSI SQL MERGE statement (introduced in SQL:2003), with extensions<br />
  13. 13. What can I do with it?<br />ETL processes<br />“Update if exists, insert otherwise” stored procedures<br />Data comparison, verification and modification<br />MERGE Statement<br />
  14. 14. MERGE Syntax<br />MERGE [INTO] target_table AS target<br />USING source_table AS source (table, view, derived table)<br /> ON &lt;merge_search_conditions&gt;<br />WHEN MATCHED [AND &lt;other predicates&gt;] THEN<br /> UPDATE SET target.col2 = source.col2 (or DELETE)<br />WHEN NOT MATCHED [BY TARGET] [AND &lt;other predicates&gt;] THEN<br /> INSERT [(col_list)] VALUES (col_list)<br />WHEN NOT MATCHED BY SOURCE [AND &lt;other predicates&gt;] THEN<br /> DELETE (or UPDATE)<br />OUTPUT $action, inserted.col, deleted.col, source.col;<br /> Get used to semicolons!<br />
  15. 15. MERGE Statement<br />$action function in OUTPUT clause<br />Multiple WHEN clauses possible <br />Up to two each for MATCHED and NOT MATCHED BY SOURCE (the first must then carry an AND predicate)<br />Only one WHEN clause for NOT MATCHED [BY TARGET]<br />Rows affected includes total rows affected by all clauses<br />
  16. 16. MERGE Performance<br />MERGE statement is transactional<br />No explicit transaction required<br />One pass through tables<br />At most a full outer join<br />Matching rows (inner join) = when matched<br />Left-outer join rows = when not matched<br />Right-outer join rows = when not matched by source<br />When optimizing, optimize joins<br />Index columns used in ON clause (if possible, unique index on source table)<br />
  17. 17. MERGE and Determinism<br />UPDATE using a JOIN is non-deterministic<br />If more than one row in source matches ON clause, either/any row can be used for the UPDATE.<br />MERGE is deterministic<br />If more than one row in source matches ON clause, an exception is thrown.<br />
  18. 18. MERGE & Triggers<br />When there are triggers on the target table<br />Trigger is raised per DML (insert/update/delete)<br />“Inserted” and “deleted” tables are populated<br />In much the same way, MERGE is treated by replication as a series of insert, update and delete commands.<br />
  19. 19. Where is MERGE useful?<br />Replace UPDATE … JOIN and DELETE … JOIN statements<br />ETL Processes<br />Set comparison<br />Update-if-exists, insert-otherwise procedures<br />IF EXISTS (SELECT … FROM tbl)<br /> UPDATE tbl …<br />ELSE <br /> INSERT …<br />
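The update-if-exists, insert-otherwise pattern above can be sketched as a single MERGE. This is a minimal illustration only: the tables `dbo.MergeTarget` and `dbo.MergeSource` and their columns are hypothetical and do not come from the deck.

```sql
-- Hypothetical demo tables (names and columns are assumptions for illustration).
CREATE TABLE dbo.MergeTarget (custid INT PRIMARY KEY, companyname NVARCHAR(40), phone NVARCHAR(24));
CREATE TABLE dbo.MergeSource (custid INT PRIMARY KEY, companyname NVARCHAR(40), phone NVARCHAR(24));

INSERT INTO dbo.MergeTarget VALUES (1, N'cust 1', N'(111) 111-1111'), (2, N'cust 2', N'(222) 222-2222');
INSERT INTO dbo.MergeSource VALUES (2, N'cust 2', N'(222) 999-9999'), (3, N'cust 3', N'(333) 333-3333');

MERGE INTO dbo.MergeTarget AS tgt
USING dbo.MergeSource AS src
   ON tgt.custid = src.custid
-- Row exists on both sides and something changed: update it.
WHEN MATCHED AND (tgt.companyname <> src.companyname OR tgt.phone <> src.phone) THEN
    UPDATE SET tgt.companyname = src.companyname,
               tgt.phone       = src.phone
-- Row exists only in the source: insert it.
WHEN NOT MATCHED BY TARGET THEN
    INSERT (custid, companyname, phone)
    VALUES (src.custid, src.companyname, src.phone)
-- Row exists only in the target: remove it.
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
-- $action reports which branch touched each row.
OUTPUT $action         AS merge_action,
       inserted.custid AS new_custid,
       deleted.custid  AS old_custid;
```

With the sample rows above, customer 1 is deleted, customer 2 is updated and customer 3 is inserted, all in one statement and one implicit transaction. Drop the `WHEN NOT MATCHED BY SOURCE` branch if the target should keep rows that are missing from the source.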
  20. 20. MERGE Statement<br /><ul><li> MERGE usage</li></li></ul><li>Table-Valued Parameters<br />Before SQL Server 2008<br />In order to pass a set of data to SQL Server:<br />Comma-delimited strings<br />XML<br />BULK INSERT<br />Why would I want to do that?<br />Fewer database round-trips<br />“Array” variable<br />Example: Pass an order + its line details<br />Table-valued parameters solve this problem<br />
  21. 21. Table Types<br />SQL Server has table variables<br /> DECLARE @t TABLE (id int);<br />SQL Server 2008 adds strongly typed table variables<br />CREATE TYPE mytab AS TABLE (id int);<br />DECLARE @t mytab;<br />Parameters must use strongly typed table variables <br />
  22. 22. Table Variables are Input Only<br />Declare and initialize TABLE variable<br />DECLARE @t mytab;<br />INSERT @t VALUES (1), (2), (3);<br />EXEC usetable @t;<br />Parameter must be declared READONLY<br />CREATE PROCEDURE usetable<br /> ( @t mytab READONLY ...)<br />AS<br /> INSERT INTO lineitems SELECT * FROM @t;<br /> UPDATE @t SET... -- no!<br />
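As an end-to-end sketch of the order-plus-line-details scenario mentioned earlier, the following passes a set of line items to a procedure in one call. The type, table and procedure names (`dbo.LineItemList`, `dbo.LineItems`, `dbo.AddOrderLines`) are hypothetical; `GO` is the batch separator used by SSMS and sqlcmd.

```sql
-- Strongly typed table type: the "shape" of the parameter (hypothetical name).
CREATE TYPE dbo.LineItemList AS TABLE (productid INT NOT NULL, qty INT NOT NULL);
GO
-- Hypothetical destination table.
CREATE TABLE dbo.LineItems (orderid INT NOT NULL, productid INT NOT NULL, qty INT NOT NULL);
GO
CREATE PROCEDURE dbo.AddOrderLines
    @orderid INT,
    @lines   dbo.LineItemList READONLY   -- table-valued parameters must be READONLY
AS
BEGIN
    SET NOCOUNT ON;
    -- The parameter can be read like any table, but not modified.
    INSERT INTO dbo.LineItems (orderid, productid, qty)
    SELECT @orderid, productid, qty
    FROM @lines;
END;
GO
-- Caller: fill a variable of the table type and pass it in a single round-trip.
DECLARE @l dbo.LineItemList;
INSERT INTO @l (productid, qty) VALUES (10, 2), (11, 1), (12, 5);
EXEC dbo.AddOrderLines @orderid = 1, @lines = @l;
```

Client code can pass the same parameter without building strings or XML; in ADO.NET this is typically done with a `DataTable` and a parameter of type `SqlDbType.Structured`.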
  23. 23. TVP Implementation and Performance<br />Table Variables materialized in TEMPDB<br />Faster than parameter arrays, BCP APIs still fastest<br />[Chart: duration (ms) by number of rows passed]<br />
  24. 24. Table-Valued Parameters<br /><ul><li> Table-Valued Parameters end-to-end</li></li></ul><li>Grouping Sets<br />GROUP BY<br />Used to group rows by values in specified columns<br />Aggregate data <br />(sum(), count(*), avg(), min(), max()…)<br />Grouping Sets<br />Why limit the query to a single grouping?<br />
  25. 25. Grouping Sets<br />GROUP BY Extensions<br />GROUPING SETS<br />Specifies multiple groupings of data in one query<br />CUBE *<br />All combinations of the columns in the column set<br />ROLLUP *<br />Subtotals, super-aggregates and grand total<br />* Don’t confuse with the old nonstandard WITH CUBE and WITH ROLLUP options<br />
  26. 26. Grouping_ID()<br />New function that computes the level of grouping<br />Takes a set of columns as parameters and returns the grouping level<br />Helps identify the grouping set that each result row belongs to.<br />GROUPING(col) function returns 1 bit: 1 if the column is aggregated (rolled up) in the result row, 0 if the row is grouped by the column.<br />
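A small sketch of GROUPING SETS together with GROUPING() and GROUPING_ID(). The `@Sales` table variable and its data are made up for illustration.

```sql
-- Hypothetical sample data.
DECLARE @Sales TABLE (region VARCHAR(10), yr INT, amount MONEY);
INSERT INTO @Sales VALUES
    ('North', 2007, 100), ('North', 2008, 150),
    ('South', 2007,  80), ('South', 2008, 120);

SELECT region,
       yr,
       SUM(amount)             AS total,
       GROUPING(region)        AS region_rolled_up,  -- 1 when region is aggregated away in this row
       GROUPING_ID(region, yr) AS grp_id             -- bitmap: GROUPING(region) * 2 + GROUPING(yr)
FROM @Sales
GROUP BY GROUPING SETS
(
    (region, yr),   -- grp_id = 0: detail by region and year
    (region),       -- grp_id = 1: subtotal per region (yr rolled up)
    (yr),           -- grp_id = 2: subtotal per year (region rolled up)
    ()              -- grp_id = 3: grand total
)
ORDER BY grp_id, region, yr;
```

The four grouping sets listed here are exactly what `GROUP BY CUBE(region, yr)` would produce; `GROUP BY ROLLUP(region, yr)` would produce the same list without the `(yr)` set. Filtering or sorting on `grp_id` is usually more reliable than testing the output columns for NULL, because NULL can also be a real data value.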
  27. 27. Grouping Sets<br /><ul><li>How to use Grouping Sets</li></li></ul><li>Variable Initialization & Compound Assignment<br />T-SQL#<br />Small programming enhancements targeting more convenient and efficient development<br />Variable Initialization<br />DECLARE @i AS INT = 0<br /> , @d AS DATETIME = CURRENT_TIMESTAMP<br /> , @j AS INT = (SELECT COUNT(*) FROM sysobjects);<br />select @i,@d, @j;<br />
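The slide title also mentions compound assignment, which the variable-initialization example does not show. A minimal sketch of the compound operators added in SQL Server 2008; the `@Stock` table variable is hypothetical.

```sql
DECLARE @i INT = 10;   -- inline initialization
SET @i += 5;           -- 15
SET @i *= 2;           -- 30
SET @i -= 4;           -- 26
SET @i /= 2;           -- 13
SET @i %= 5;           -- 3
SELECT @i AS result;

-- Compound assignment also works in the SET clause of an UPDATE.
DECLARE @Stock TABLE (productid INT, qty INT);
INSERT INTO @Stock VALUES (1, 10);
UPDATE @Stock SET qty += 5 WHERE productid = 1;
SELECT productid, qty FROM @Stock;
```

The bitwise forms `&=`, `|=` and `^=` follow the same pattern.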
  28. 28. Table Value Constructors<br />Use the VALUES clause to construct a set of rows, either to insert multiple rows or as a derived table<br />INSERT INTO dbo.Customers(custid, companyname, phone, address)<br />  VALUES<br />  (1, &apos;cust 1&apos;, &apos;(111) 111-1111&apos;, &apos;address 1&apos;),<br />  (2, &apos;cust 2&apos;, &apos;(222) 222-2222&apos;, &apos;address 2&apos;),<br />  (3, &apos;cust 3&apos;, &apos;(333) 333-3333&apos;, &apos;address 3&apos;);<br />SELECT * FROM<br />( VALUES <br /> (CAST(&apos;20090730&apos; AS DATE),&apos;Tisha B&apos;&apos;Av&apos;)<br /> ,(CAST(&apos;20090918&apos; AS DATE),&apos;Erev Rosh HaShana&apos;)<br /> ,(CAST(&apos;20090919&apos; AS DATE),&apos;Rosh HaShana&apos;)<br /> ) AS holidays (HolidayDate,description);<br />
  29. 29. FILESTREAM Storage<br />To Blob Or Not To Blob?<br />Cons<br /> LOBs take memory buffers<br /> Updating LOBs causes fragmentation<br /> Poor streaming capabilities<br />Pros<br /> Transactional consistency<br /> Point-in-time backup & restore<br /> Single storage and query vehicle<br />
  30. 30. Blob Storage Options<br />Three options: Use File Servers, Dedicated BLOB Store, Store BLOBs in Database<br />[Diagram: Application, BLOBs and DB for each of the three options]<br />Advantages<br /><ul><li>Streaming Performance</li><li>Integrated management</li><li>Data-level consistency</li><li>Enterprise-scales only</li><li>Scalability & Expandability</li></ul>Challenges<br /><ul><li>Complex application development & deployment</li><li>Separate data management</li><li>Integration with structured data</li><li>Poor data streaming support</li><li>File size limitations</li><li>Affects performance of structured data querying</li><li>Enterprise-scales only</li></ul>Example<br /><ul><li>Windows File Servers</li><li>NetApp NetFiler</li><li>EMC Centera</li><li>Fujitsu Nearline</li><li>SQL Server VARBINARY(MAX)</li></ul>
  47. 47. FILESTREAM Storage<br />FILESTREAM combines the best of 2 worlds<br /><ul><li>Integrates DB engine with NTFS</li><li>Stores BLOB data as files</li></ul>[Diagram: Application, BLOBs, DB]<br />FEATURES:<br /><ul><li>Uses NT cache for caching file data.</li><li>SQL buffer pool is not used and is available to query processing</li><li>Windows file system interface provides streaming access to data</li><li>Compressed volumes are supported</li><li>File access is part of DB transaction.</li></ul>
  53. 53. It’s not only about storing but also about working with BLOBS: <br /><ul><li>Image analysis
  54. 54. Voice interpretation
  55. 55. Mixing satellite feeds & Spatial Data type for weather reports
  56. 56. and more…</li></ul>FILESTREAM Storage<br />
  57. 57. FILESTREAM Programming<br />Dual Programming Model<br />TSQL (Same as SQL BLOB)<br />Win32 Streaming File IO APIs<br />Begin a SQL Server Transaction<br />Obtain a symbolic PATH NAME & TRANSACTION CONTEXT<br />Open a handle using sqlncli10.dll - OpenSqlFilestream<br />Use Handle Within System.IO Classes<br />Commit Transaction<br />
  58. 58. // 1. Start up a database transaction.<br />SqlTransaction txn = cxn.BeginTransaction();<br />// 2. Insert a row to create a handle for streaming.<br />new SqlCommand(&quot;INSERT &lt;Table&gt; VALUES ( @mediaId, @fileName, @contentType);&quot;, cxn, txn);<br />// 3. Get a FILESTREAM PathName & transaction context.<br />new SqlCommand(&quot;SELECT PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT() FROM &lt;Table&gt;&quot;, cxn, txn);<br />// 4. Get a Win32 file handle using SQL Native Client call.<br />SafeFileHandle handle = SqlNativeClient.OpenSqlFilestream(...);<br />// 5. Open up a new stream to write the file to the blob.<br />FileStream destBlob = new FileStream(handle, FileAccess.Write);<br />// 6. Loop through source file and write to FileStream handle<br />while ((bytesRead = sourceFile.Read(buffer, 0, buffer.Length)) &gt; 0) { destBlob.Write(buffer, 0, bytesRead); }<br />// 7. Commit transaction, cleanup connection.<br />txn.Commit();<br />FILESTREAM Programming<br />
  59. 59. FILESTREAM Implementation<br />Server & Instance level<br />Enable FILESTREAM (in setup or Configuration Manager)<br />Make sure port 445 (SMB) is open if remote access is used<br />EXEC sp_configure &apos;filestream_access_level&apos;, [0/1/2]<br />At Database Level<br />Create a FILESTREAM filegroup & map to directory<br />At Table Level<br />Define VARBINARY(MAX) FILESTREAM column(s)<br />Must have a UNIQUEIDENTIFIER ROWGUIDCOL column with a unique constraint (and a file extension column if FTS is used)<br />
  60. 60. FILESTREAM Implementation<br />Integrated security<br />ACLs (NT permissions) granted only to SQL Server service account.<br />Permissions to access files implied by granting read/write permissions on FILESTREAM column in SQL Server.<br />Naturally, only Windows authentication is supported.<br />Integrated management<br />BACKUP and RESTORE (for both database and log) also back up and restore FILESTREAM data.<br />Note that in the FULL RECOVERY MODEL, “deleted” files are not deleted until the log is backed up.<br />
  61. 61. FILESTREAM Limitations<br />Not supported:<br />Remote FILESTREAM storage<br />Database snapshot and Mirroring<br />Supported:<br />Replication (with SQL Server 2008 subscribers)<br />Log shipping (with SQL Server 2008 secondaries)<br />SQL Server Express Edition (4 GB size limit does not apply to FILESTREAM data container)<br />Features not integrated:<br />SQL Encryption<br />Table-Valued Parameters<br />
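The three levels above (instance, database, table) can be sketched in T-SQL as follows. The database name `MediaDemo`, the `C:\Data` paths and the `dbo.Media` table are placeholders, not taken from the deck. The sketch assumes FILESTREAM has already been enabled for the instance in setup or Configuration Manager, that `C:\Data` exists, and that the FILESTREAM container folder itself does not exist yet.

```sql
-- Instance level: 0 = disabled, 1 = T-SQL access only, 2 = T-SQL and Win32 streaming access.
EXEC sp_configure 'filestream access level', 2;
RECONFIGURE;
GO
-- Database level: a FILESTREAM filegroup is mapped to a directory (placeholder paths).
CREATE DATABASE MediaDemo
ON PRIMARY
    (NAME = MediaDemo_data,  FILENAME = 'C:\Data\MediaDemo.mdf'),
FILEGROUP MediaDemo_fs CONTAINS FILESTREAM
    (NAME = MediaDemo_blobs, FILENAME = 'C:\Data\MediaDemo_blobs')
LOG ON
    (NAME = MediaDemo_log,   FILENAME = 'C:\Data\MediaDemo.ldf');
GO
USE MediaDemo;
GO
-- Table level: a unique ROWGUIDCOL column is required next to the FILESTREAM column.
CREATE TABLE dbo.Media
(
    mediaId     UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
    fileName    NVARCHAR(260)  NOT NULL,
    contentType NVARCHAR(100)  NOT NULL,
    content     VARBINARY(MAX) FILESTREAM NULL
);
GO
-- T-SQL access works like any other VARBINARY(MAX) column.
INSERT INTO dbo.Media (mediaId, fileName, contentType, content)
VALUES (NEWID(), N'hello.txt', N'text/plain', CAST('hello' AS VARBINARY(MAX)));

-- PathName() returns the logical path a Win32 client would open for streaming.
SELECT fileName, content.PathName() AS blob_path
FROM dbo.Media;
```

The path returned by `PathName()` is only usable together with a transaction context, as outlined in the streaming steps on the FILESTREAM Programming slide.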
  62. 62. HierarchyID<br />
  63. 63. HierarchyID<br />When should I use it?<br /><ul><li>List forum threads
  64. 64. Business organization charts
  65. 65. Product categories
  66. 66. Files/Folders management
  67. 67. Anything hierarchical</li></ul>Features:<br /><ul><li> Compact - 100,000 nodes in 6 levels ~ 5 bytes / node
  68. 68. Available to CLR clients as the SqlHierarchyId data type</li></li></ul><li>Each node holds its parent’s name/ID<br />Pros:<br /><ul><li>Understandable
  69. 69. Manageable
  70. 70. 2005 – CTEs can be used for recursive queries</li></ul>Cons:<br /><ul><li> Queries are more complex
  71. 71. Bad performance with large trees.</li></ul>2005 Alternatives - Adjacency Model<br />
  72. 72. 2005 Alternatives – Path Enumeration<br />Each node holds a path to the root, as a string concatenation<br />Pros:<br /><ul><li> Logical representation
  73. 73. Easy tree traversal</li></ul>Cons: <br /><ul><li> Difficult to maintain
  74. 74. Searches are done with string functions (LIKE, string split)</li></li></ul><li>&quot;Left&quot; and &quot;Right&quot; columns represent edges <br />Pros:<br /><ul><li> Easy to query with &gt;, &lt;, BETWEEN
  75. 75. Easy to index</li></ul>Cons:<br /><ul><li> Difficult to maintain</li></ul>(try to add another underboss…)<br />2005 Alternatives – Nested Sets<br />
  76. 76. HierarchyID<br /> New data type in SQL Server 2008<br /> Uses path enumeration, but in a much more efficient binary representation.<br /> Exposes methods to query the tree and set relations between nodes.<br /> Remember that it’s just data<br /> It’s up to the application to assign correct hierarchyID values to the nodes to represent the relations (using HierarchyID methods)<br /> It’s up to the developer/DBA to create a unique constraint on the hierarchyID column<br />
  77. 77. HierarchyID<br />Nodes are ordered<br />Root: /<br />First child: /1/<br />Second Child: /2/<br />First grandchild: /1/1/<br />Second: /1/2/<br />If you want to insert another grandchild between them - /1/1.1/<br />
  78. 78. HierarchyID Methods<br />hierarchyid::GetRoot() – get root of hierarchy tree<br />hierarchyid::Parse() – same as cast(@str as hierarchyid)<br />node.ToString() - Logical string representation of node’s hID<br />parent.GetDescendant(c1,c2) - returns a child node of parent between c1 and c2<br />node.GetAncestor(n) - hierarchyid of the nth ancestor of node<br />node.IsDescendantOf(pID) - true if node is a descendant of pID (a node counts as its own descendant)<br />node.GetLevel() – integer representing depth of node<br />node.GetReparentedValue(oldRoot,newRoot) – get path to newRoot for a node that is a descendant of oldRoot (use to move subtrees within the tree)<br />
  79. 79. HierarchyID Indexes<br />Depth-first Index<br />Ordered by path<br />Useful for querying sub-trees<br />e.g. all files in a subfolder<br />Breadth-first Index<br />Order by level, then path<br />Useful for querying immediate children<br />
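A short sketch that exercises several of the methods listed above and reproduces the /1/, /2/, /1/1/, /1/2/ and /1/1.1/ paths from the ordering slide. The `@Org` table variable and the node names are made up for illustration.

```sql
DECLARE @Org TABLE (node HIERARCHYID PRIMARY KEY, name VARCHAR(30));

DECLARE @root  HIERARCHYID = hierarchyid::GetRoot();               -- /
DECLARE @boss1 HIERARCHYID = @root.GetDescendant(NULL, NULL);      -- /1/   (first child)
DECLARE @boss2 HIERARCHYID = @root.GetDescendant(@boss1, NULL);    -- /2/   (after /1/)
DECLARE @emp1  HIERARCHYID = @boss1.GetDescendant(NULL, NULL);     -- /1/1/
DECLARE @emp2  HIERARCHYID = @boss1.GetDescendant(@emp1, NULL);    -- /1/2/
DECLARE @emp15 HIERARCHYID = @boss1.GetDescendant(@emp1, @emp2);   -- /1/1.1/ (between the two)

INSERT INTO @Org VALUES
    (@root,  'CEO'),
    (@boss1, 'Boss 1'),
    (@boss2, 'Boss 2'),
    (@emp1,  'Employee A'),
    (@emp15, 'Employee B'),
    (@emp2,  'Employee C');

-- The subtree under Boss 1, in depth-first (path) order.
SELECT node.ToString() AS node_path,
       node.GetLevel() AS lvl,
       name
FROM @Org
WHERE node.IsDescendantOf(@boss1) = 1
ORDER BY node;
```

The query returns Boss 1 itself as well as the three employees, because a node is treated as a descendant of itself. The application is responsible for generating the values with `GetDescendant`; the primary key here is what keeps two rows from claiming the same position.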
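The two index orders can be sketched like this. The table `dbo.OrgChart`, its columns and the index names are hypothetical.

```sql
CREATE TABLE dbo.OrgChart
(
    node HIERARCHYID NOT NULL,
    lvl  AS node.GetLevel(),      -- computed depth, used by the breadth-first index
    name VARCHAR(30) NOT NULL
);

-- Depth-first: rows ordered by path, so a subtree is stored contiguously.
CREATE UNIQUE CLUSTERED INDEX IX_OrgChart_DepthFirst
    ON dbo.OrgChart (node);

-- Breadth-first: rows ordered by level, then path, so siblings are stored together.
CREATE UNIQUE INDEX IX_OrgChart_BreadthFirst
    ON dbo.OrgChart (lvl, node);

DECLARE @parent HIERARCHYID = hierarchyid::Parse('/1/');

-- Subtree query: a natural fit for the depth-first index.
SELECT name FROM dbo.OrgChart WHERE node.IsDescendantOf(@parent) = 1;

-- Immediate-children query: a natural fit for the breadth-first index.
SELECT name FROM dbo.OrgChart WHERE node.GetAncestor(1) = @parent;
```

Which of the two should be the clustered index depends on which query shape dominates; the choice above is only one possibility.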
  80. 80. Insert Root<br />Insert 1st Subordinate<br />Insert rest of tree<br />Query Hierarchical Data<br />Demo Structure<br />
  81. 81. HierarchyID<br /><ul><li>HierarchyID Basics
  82. 82. Trees and Hierarchies</li></li></ul><li>Temporal Data Types<br />Prior to SQL Server 2008<br />DATETIME<br />Range: 01-01-1753 to 31-12-9999<br />Accuracy: Rounded increments of 3.333ms<br />Storage: 8 bytes<br />SMALLDATETIME<br />Range: 01-01-1900 to 06-06-2079<br />Accuracy: 1 minute<br />Storage: 4 bytes<br />
  83. 83. Date/Time Types<br />SQL Server 2008<br />Save date and time in separate columns.<br /> Query all logins that occurred 18:00-7:00 during weekdays<br />Higher precision <br />Scientific data<br />Timezone offset<br /> Global applications<br />
  84. 84. DATE and TIME<br />DATE Data Type - Date Only<br />01-01-0001 to 31-12-9999 Gregorian Calendar<br />Takes only 3 bytes<br />TIME Data Type - Time Only<br />Variable Precision - 0 to 7 decimal places for seconds<br />Up to 100 nanoseconds<br />Takes 3-5 bytes, depending on precision<br />
  85. 85. DATETIME2 and DATETIMEOFFSET<br />DATETIME2 Data Type<br />01-01-0001 to 31-12-9999 Gregorian Calendar<br />Variable Precision - to 100 nanoseconds<br />Takes 6-8 bytes (same or less than DATETIME!)<br />DATETIMEOFFSET<br />01-01-0001 to 31-12-9999 Gregorian Calendar<br />Variable Precision - to 100 nanoseconds<br />Time Zone Offset (From UTCTime) Preserved<br />No Daylight Saving Time Support<br />Takes 8-10 bytes<br />
  86. 86. Date/Time Types Compatibility<br />New Data Types Use Same T-SQL Functions<br />DATENAME (datepart, date)<br />DATEPART (datepart,date)<br />DATEDIFF (datepart, startdate, enddate)<br />DATEADD (datepart, number, date)<br />Datepart can also be microsecond, nanosecond, TZoffset<br />MONTH<br />DAY<br />YEAR<br />CONVERT<br />
  87. 87. Date Time Library Extensions<br />Current date/time in higher precision <br />SYSDATETIME<br />SYSUTCDATETIME<br />SYSDATETIMEOFFSET<br />Original date/time uses<br />GETDATE, GETUTCDATE, CURRENT_TIMESTAMP<br />ISDATE(datetime/smalldatetime)<br />Special functions for DATETIMEOFFSET<br />SWITCHOFFSET(datetimeoffset, timezone)<br />TODATETIMEOFFSET(datetime, timezone)<br />
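A brief sketch of the new types and functions working together. The literal values are arbitrary examples.

```sql
DECLARE @d   DATE              = '2009-07-30';
DECLARE @t   TIME(3)           = '18:30:15.1234567';   -- stored with 3 fractional digits
DECLARE @dt2 DATETIME2(7)      = SYSDATETIME();        -- local time, 100 ns precision
DECLARE @dto DATETIMEOFFSET(7) = SYSDATETIMEOFFSET();  -- local time plus UTC offset

SELECT @d                                AS date_only,
       @t                                AS time_only,
       @dt2                              AS local_high_precision,
       SYSUTCDATETIME()                  AS utc_high_precision,
       @dto                              AS local_with_offset,
       SWITCHOFFSET(@dto, '+00:00')      AS same_instant_as_utc,    -- changes the offset, keeps the instant
       TODATETIMEOFFSET(@dt2, '+02:00')  AS tagged_with_plus_two,   -- attaches an offset, keeps the clock time
       DATEPART(TZoffset, @dto)          AS offset_in_minutes,
       DATEADD(microsecond, 250, @dt2)   AS plus_250_microseconds;
```

`SWITCHOFFSET` and `TODATETIMEOFFSET` are easy to confuse: the first converts an existing DATETIMEOFFSET to another offset, the second builds a DATETIMEOFFSET from a value that has no offset yet. Neither knows about daylight saving time, so the offset must be supplied by the caller.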
  88. 88. Temporal Data Types<br /><ul><li>New date and time features</li></li></ul><li>Spatial Data Type<br />Spatial data provides answers to location-based queries<br />Which roads intersect the Microsoft campus?<br />Does my land claim overlap yours?<br />List all of the Italian restaurants within 5 kilometers<br />Spatial data is part of almost every database<br />If your database includes an address<br />
  89. 89. Spatial Data Type<br />Represent geometric data:<br />Point<br />Lines<br />Polygons<br />Multi-point/line/polygon<br />
  90. 90. Spatial Data Type<br />SQL Server supports two spatial data types<br />GEOMETRY - flat earth model<br />GEOGRAPHY - round earth model<br />Both types support all of the instantiable OGC types<br />InstanceOf method can distinguish between them<br />
  91. 91. Spatial Data Type - Input<br />Spatial data is stored in a proprietary binary format<br />Instance of the type can be NULL<br />Can be input as<br />Well Known binary - ST[Type]FromWKB<br />Well Known text - ST[Type]FromText<br />Geography Markup Language (GML) - GeomFromGml<br />Can also use SQLCLR functions<br />Parse<br />Point - extension function<br />Input from SQLCLR Type - SqlGeometry, SqlGeography<br />
  92. 92. Spatial Data Type - Output<br />Spatial Data Can Be Output As<br />Well Known binary - STAsBinary<br />Well Known text - STAsText<br />GML - AsGml<br />Text with Z and M values - AsTextZM<br />SQLCLR standard method<br />ToString - returns Well Known text<br />As SQLCLR object - SqlGeometry, SqlGeography<br />Other useful formats are GeoRSS, KML (not directly supported)<br />
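A minimal sketch of the input and output methods above for both spatial types. The coordinates are rough, illustrative values for two cities, and SRID 4326 (WGS 84) is an assumption of the example.

```sql
-- GEOGRAPHY (round earth): Point() takes latitude, longitude, SRID;
-- well-known text lists longitude first.
DECLARE @telaviv   GEOGRAPHY = geography::Point(32.08, 34.78, 4326);
DECLARE @jerusalem GEOGRAPHY = geography::STGeomFromText('POINT(35.21 31.77)', 4326);

SELECT @telaviv.STAsText()                      AS wkt,
       @telaviv.AsGml()                         AS gml,
       @telaviv.STDistance(@jerusalem) / 1000.0 AS distance_km;  -- STDistance is in meters for SRID 4326

-- GEOMETRY (flat earth): plain x/y coordinates.
DECLARE @square GEOMETRY = geometry::STGeomFromText('POLYGON((0 0, 4 0, 4 4, 0 4, 0 0))', 0);
DECLARE @pt     GEOMETRY = geometry::Point(1, 1, 0);

SELECT @square.STArea()              AS area,          -- 4 x 4 square
       @square.STContains(@pt)       AS contains_pt,   -- 1 when the point lies inside
       @square.InstanceOf('Polygon') AS is_polygon;
```

A "restaurants within 5 kilometers" query follows the same pattern: compare `STDistance` against 5000 in the WHERE clause.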
  93. 93. Useful Spatial Methods & Properties<br />
  94. 94. Spatial Data Types<br /><ul><li>New spatial features</li></li></ul><li>Management Studio Enhancements<br />IntelliSense<br />Finally built in!<br />Only available when querying 2008 instances<br />Word Completion<br />Quick Info<br />Syntax errors<br />Doesn’t work in SQLCMD mode!<br />Can be turned off<br />Code Collapse & Expand<br />
  95. 95. Management Studio Enhancements<br />Multi Instance Queries<br />Quickly run SQL statements on multiple instances<br />Downside <br />if you wish to insert the dataset to a single table, you better find some other tool.<br />Useful for ad-hoc queries, not more than that…<br />Custom status bar color (per server)<br />Quick trace<br />New Activity Monitor<br />
  96. 96. Reporting Services 2008 Enhancements<br /><ul><li>Architecture</li></ul>Reporting engine no longer requires IIS<br />Better memory management<br /><ul><li>Reacts better to memory pressure</li></ul>Better Processing<br /><ul><li>On-demand processing redesigned for scalability
  97. 97. Page to page response time is constant</li></li></ul><li>Processing Improvements: Benefits<br />
  98. 98. Tablix Control<br /><ul><li>What is it?</li></ul>Best of both table data-region and matrix data-region<br />Allows fixed and dynamic columns and rows<br />Enables Arbitrary nesting on each axis<br />Enables multiple parallel row/column members at each level<br />Introduces optional omission of row/column headers<br />
  99. 99. Parallel Dynamic Groups<br />2005<br />2008<br />
  100. 100. Mixed Dynamic & Static columns <br />2005<br />2008<br />
  101. 101. Enriched Visualizations – Dundas Charts<br />
  102. 102. Enriched Visualizations – Dundas Gauges<br />
  103. 103. <ul><li>Export to Word
  104. 104. Report Builder 2.0</li></ul>Reporting Services 2008 Usability features<br />
  105. 105. Why?<br /><ul><li>Cost of storage rises with database size
  106. 106. High-end disks are expensive
  107. 107. Multiple copies required – test, HA, backups
  108. 108. Cost of managing storage rises with database size
  109. 109. Time taken for backups and maintenance operations (IO bound)
  110. 110. Time taken to restore backups in a disaster
  111. 111. Migration from other platforms (Oracle or DB2) that support compression is not possible
  112. 112. Compressing data leads to better memory utilization
  113. 113. SQL Server needs data compression!!</li></ul> (But only in Enterprise edition…)<br />Data Compression<br />
  114. 114. Data Compression<br />Before SQL Server 2008<br /><ul><li>(n)varchar, varbinary – no trailing spaces or zeroes saved (unlike char, binary)
  115. 115. SQL Server 2005 SP2 introduces vardecimal
  116. 116. Same solution – variable length column, empty bytes are not stored
  117. 117. But on a smaller scale:
  118. 118. (n)char/binary’s maximum capacity is 8000 bytes.
  119. 119. Decimal’s maximum capacity is 17 bytes (precision 29-38)
  120. 120. (but if you have millions of decimal cells in a fact table, you can benefit from compression)</li></li></ul><li>Data Compression<br />New in SQL Server 2008<br /><ul><li>Row Compression
  121. 121. Compresses fixed-length data types by turning them into variable-length (a step up from vardecimal).
  122. 122. Row meta data (row header, null bitmap) is also saved in a new variable-length format.
  123. 123. Page Compression
  124. 124. Row compression
  125. 125. Prefix compression
  126. 126. Dictionary compression</li></li></ul><li>Row Compression<br />Row compression<br />CREATE TABLE Countries (ID INT, Name CHAR(50))<br />
  127. 127. Row Compression<br />Row compression<br />CREATE TABLE Countries (ID INT, Name CHAR(50))<br />Uncompressed, every row stores: ID = 4 bytes, Name = 50 bytes<br />(4b + 50b) * 3 rows = 162 bytes (not including row overhead)<br />
  128. 128. Row Compression<br />Row compression<br />CREATE TABLE Countries (ID INT, Name CHAR(50))<br />With row compression, each of the three rows stores only the bytes it needs:<br />Row 1: ID = 1 byte, Name = 7 bytes<br />Row 2: ID = 1 byte, Name = 11 bytes<br />Row 3: ID = 1 byte, Name = 6 bytes<br />
  129. 129. Row Compression<br />Row compression<br />CREATE TABLE Countries (ID INT, Name CHAR(50))<br />162 bytes reduced to:<br />1+7+1+11+1+6 = 27 bytes <br />(not including row overhead and 4 bits per column for offsets)<br />
  130. 130. Page Compression<br />1st phase: Row compression<br />2nd phase: Prefix compression<br /><ul><li>The general idea is to look for repeating patterns at the beginning of each value in each column.
  131. 131. The largest value for each column is stored in the compression information structure (CI).
  132. 132. The in-row values are replaced with indicators of full or partial matches with the value in the CI.
  133. 133. The process uses byte-level comparisons across all data types.</li></li></ul><li>2nd phase: Prefix compression example<br />Prefix compression<br />Page Compression<br />
  135. 135. Page Compression<br />3rd phase: Dictionary compression<br /><ul><li>The whole page is scanned looking for common values, which are stored in the CI area on the page.
  136. 136. The in-row values are replaced with pointers to the CI area.</li></ul>Dictionary compression<br />
  137. 137. Estimating space savings<br />EXEC sp_estimate_data_compression_savings<br /><ul><li>Returns current size and estimated compressed size for a table, an index or a partition of them, according to selected compression type (row or page).
  138. 138. It creates a sampled subset of the data in tempdb and compresses it using the requested compression mechanism to get the estimate.</li></li></ul><li>Enabling and Disabling Data Compression<br /><ul><li>Data compression can be set on heaps, clustered indexes, non-clustered indexes and their partitions (including indexes on views).
  139. 139. Setting data compression on a table (CREATE/ALTER TABLE ) only affects the heap or the clustered index.
  140. 140. Data compression has to be set for each non-clustered index individually.</li></li></ul><li>Enabling and Disabling Data Compression<br /><ul><li>Syntax:</li></ul>CREATE TABLE/INDEX … WITH<br /> (DATA_COMPRESSION = NONE/ROW/PAGE<br /> [ON PARTITIONS (partition_number/range)])<br />ALTER TABLE/INDEX … REBUILD [PARTITION = ALL/partition_number] WITH<br /> (DATA_COMPRESSION = NONE/ROW/PAGE)<br />
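The estimate-then-enable workflow can be sketched as follows, reusing the `Countries` table from the row compression slides. The `dbo` schema, the index name `IX_Countries_Name` and the `CountriesArchive` table are assumptions of the example; data compression requires an edition that supports it.

```sql
CREATE TABLE dbo.Countries (ID INT, Name CHAR(50));
CREATE NONCLUSTERED INDEX IX_Countries_Name ON dbo.Countries (Name);

-- 1. Estimate: current vs. estimated size for every index and partition of the table.
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'Countries',
     @index_id         = NULL,      -- NULL = all indexes
     @partition_number = NULL,      -- NULL = all partitions
     @data_compression = 'PAGE';

-- 2. Compress the heap (or the clustered index, if the table has one).
ALTER TABLE dbo.Countries REBUILD PARTITION = ALL
WITH (DATA_COMPRESSION = PAGE);

-- 3. Non-clustered indexes are not affected by step 2 and are set individually.
ALTER INDEX IX_Countries_Name ON dbo.Countries REBUILD
WITH (DATA_COMPRESSION = ROW);

-- A new table can be created compressed from the start.
CREATE TABLE dbo.CountriesArchive (ID INT, Name CHAR(50))
WITH (DATA_COMPRESSION = ROW);
```

Rebuilding with `DATA_COMPRESSION = NONE` reverses the change.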
  141. 141. When is data compressed?<br />With ROW compression, compression occurs row by row, whenever a row is inserted or updated.<br />With PAGE compression, it gets complicated.<br />In heaps, PAGE compression only occurs in the following ways:<br />On table REBUILD <br />When data is inserted with BULK INSERT<br />When data is inserted with INSERT INTO … WITH(TABLOCK)<br />
  142. 142. When is data compressed?<br />In indexes, pages are only ROW compressed until they fill up. <br />When the next insert occurs, it triggers PAGE compression.<br />If there’s space left for the new row after PAGE compression, the row is compressed and inserted into the newly compressed page.<br />If there’s no space, the page is not compressed and the row will be inserted on a new page<br />
  143. 143. Monitoring data compression<br /><ul><li>Compression states:</li></ul>data_compression column in sys.partitions<br /><ul><li>sys.dm_db_index_operational_stats</li></ul>page_compression_attempt_count/<br />page_compression_success_count<br /><ul><li>sys.dm_db_index_physical_stats</li></ul>compressed_page_count<br /><ul><li>New System Monitor counters for the whole server</li></ul> Page compression attempts/sec<br /> Pages compressed/sec<br />
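The catalog view and DMVs listed above can be queried as in this sketch. It assumes a table named `dbo.Countries` in the current database; any table name can be substituted.

```sql
-- Compression state per partition.
SELECT OBJECT_NAME(p.object_id) AS table_name,
       p.index_id,
       p.partition_number,
       p.data_compression_desc      -- NONE, ROW or PAGE
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID('dbo.Countries');

-- How often PAGE compression was attempted and how often it succeeded.
SELECT OBJECT_NAME(s.object_id) AS table_name,
       s.index_id,
       s.partition_number,
       s.page_compression_attempt_count,
       s.page_compression_success_count
FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('dbo.Countries'), NULL, NULL) AS s;

-- How many pages are actually PAGE compressed right now.
SELECT ps.index_id,
       ps.partition_number,
       ps.page_count,
       ps.compressed_page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Countries'), NULL, NULL, 'DETAILED') AS ps;
```

A large gap between attempts and successes suggests that pages are being evaluated for PAGE compression without gaining enough space to make it worthwhile.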
  144. 144. Data Compression: Limitations<br />Data compression is Enterprise Edition only<br />Data compression cannot be used in conjunction with sparse columns<br />Data compression doesn’t increase the capacity per row. You can’t insert more data per row with data compression.<br />This ensures that disabling compression will always succeed<br />
  145. 145. Data Compression: Limitations<br />Non-leaf level pages in indexes are only compressed using ROW compression<br />Heap pages are not PAGE compressed during regular DML<br />LOB values out-of-row are not compressed<br />Data exported using BCP is always uncompressed.<br />Data imported using BCP will be compressed causing increased CPU usage.<br />
  146. 146. <ul><li>Estimate Space Savings
  147. 147. Compress Index and heap
  148. 148. Compare Performance
  149. 149. Query DMVs</li></ul>Data Compression<br />
  150. 150. Data Compression Performance<br />Generalization<br />No noticeable impact on INDEX SEEKs<br />Noticeable impact on large data modification operations.<br />Lower IO, higher CPU on index/table scans. Duration depends on overall system configuration.<br />Data compression should be tested thoroughly before setting up in production<br />
  151. 151. So when should I use it?<br />Read-intensive databases with low CPU usage<br />Systems with small Buffer cache (relative to data size)<br />Databases queried for few rows at a time (i.e. index seeks).<br />There’s always a trade-off. Remember the equation:<br />Data compression = Less IO = Smaller memory footprint<br />But also<br />Data compression = Higher CPU usage<br />
  152. 152. Data Compression - Numbers<br /><ul><li>Typical results: your mileage may vary </li></li></ul><li>Business cost reduction: SQL Server 2005 Resource management<br />Backup<br />OLTP Activity<br />Admin Tasks<br />Executive Reports<br />Ad-hoc Reports<br />Workloads<br />SQL Server<br /><ul><li>Single resource pool
  153. 153. No workload differentiation</li></ul>Memory, CPU, Threads, …<br />Resources<br />
  154. 154. Business cost reduction: SQL Server 2008 Resource management<br />[Diagram: the workloads (Executive Reports, Backup, OLTP Activity, Admin Tasks, Ad-hoc Reports) are classified into an OLTP Workload (importance High), an Admin Workload and a Report Workload, which run in an Application Pool and an Admin Pool with limits such as Min Memory 10%, Max Memory 20%, Max CPU 20% and Max CPU 90%]<br />SQL Server<br /><ul><li>Helps differentiate workloads (e.g. by application name/user name)
  155. 155. Limit resource usage (e.g. CPU, memory, simultaneous requests)
  156. 156. Prevent run-away queries
  157. 157. Limit resource usage of administrative tasks
  158. 158. Resource monitoring (dmvs, performance monitor, trace events)</li></li></ul><li>What else is new?<br />SQL Server Change Tracking<br />Synchronized Programming Model<br />SQL Server Conflict Detection<br />FILESTREAM data type<br />Integrated Full Text Search<br />Sparse Columns<br />Large User Defined Types<br />Date/Time Data Type<br />SPATIAL data type<br />Virtual Earth Integration<br />Partitioned Table Parallelism<br />Query Optimizations<br />Persistent Lookups <br />Change Data Capture <br />Policy Based Management<br />Backup Compression<br />MERGE SQL Statement<br />Data Profiling<br />Star Join Optimization<br />Enterprise Reporting Engine<br />Internet Report Deployment<br />Block Computations<br />Scale out Analysis<br />BI Platform Management<br />Export to Word and Excel<br />Author reports in Word and Excel<br />Report Builder Enhancements<br />TABLIX<br />Rich Formatted Data<br />Personalized Perspectives<br />Filtered Indexes<br />Filtered Statistics<br />… and many more<br />Transparent Data Encryption<br />External Key Management<br />Data Auditing<br />Pluggable CPU<br />Transparent Failover for Database Mirroring<br />Declarative Management Framework<br />Server Group Management<br />Streamlined Installation<br />Enterprise System Management<br />Performance Data Collection<br />System Analysis<br />Data Compression<br />Query Optimization Modes<br />Resource Governor<br />Entity Data Model<br />LINQ<br />Visual Entity Designer<br />Entity Aware Adapters<br />
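Returning to the Resource Governor slide: a minimal sketch of a pool, a workload group and a classifier function. The names `AdminPool`, `AdminWorkload` and `dbo.rg_classifier` are hypothetical. The percentages are borrowed from the diagram, but assigning them to this particular pool is an assumption, as is classifying by application name.

```sql
USE master;   -- the classifier function must live in master
GO
-- Resource pool: caps on CPU and memory.
CREATE RESOURCE POOL AdminPool
WITH (MAX_CPU_PERCENT = 20, MIN_MEMORY_PERCENT = 10, MAX_MEMORY_PERCENT = 20);
GO
-- Workload group: sessions classified into it use the pool above.
CREATE WORKLOAD GROUP AdminWorkload
WITH (IMPORTANCE = LOW)
USING AdminPool;
GO
-- Classifier: runs at login and returns the name of the workload group for the session.
CREATE FUNCTION dbo.rg_classifier() RETURNS SYSNAME
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @grp SYSNAME = N'default';
    -- Assumption: administrative work is recognized by its application name.
    IF APP_NAME() LIKE N'%Management Studio%'
        SET @grp = N'AdminWorkload';
    RETURN @grp;
END;
GO
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.rg_classifier);
ALTER RESOURCE GOVERNOR RECONFIGURE;
```

The classifier runs for every new connection, so it should stay cheap. Running `ALTER RESOURCE GOVERNOR DISABLE;` turns classification off again.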
  159. 159. Thank you!<br />Any questions?<br />
  160. 160. References<br />SQL Server 2008 Developer Training Kit<br />Introduction to New T-SQL Programmability Features in SQL Server 2008 / Itzik Ben Gan<br />Using The Resource Governor / Aaron Bertrand<br />Reporting Services in SQL Server 2008 / Ann Weber <br />Data Compression: Strategy, Capacity Planning and Best Practices / Sanjay Mishra<br />