10. Agenda
T-SQL enhancements: MERGE statement, table-valued parameters, grouping sets, compound assignments, table value constructors
New data types: FILESTREAM, HierarchyID, temporal (date/time) data types, spatial data types
SSMS enhancements: IntelliSense, multi-instance queries, Registered Servers properties
SSRS enhancements
More new features: data compression, Resource Governor
12. MERGE Statement What is it? MERGE allows you to compare two tables and apply changes to one of them according to matching and non-matching rows Allows multiple set operations in a single SQL statement Operations can be INSERT, UPDATE, DELETE ANSI SQL 2006 compliant - with extensions
13. What can I do with it? ETL processes “Update if exists, insert otherwise” stored procedures Data comparison, verification and modification MERGE Statement
14. MERGE Syntax MERGE [INTO] target_table AS target USING source_table AS source (table, view, derived table) ON <merge_search_conditions> WHEN MATCHED [AND <other predicates>] UPDATE SET target.col2 = source.col2 (or delete) WHEN NOT MATCHED [BY TARGET] [AND <other predicates>] INSERT [(col_list)] VALUES (col_list) WHEN NOT MATCHED BY SOURCE [AND <other predicates>] DELETE (or update) OUTPUT $action, inserted.col, deleted.col, source.col; Get used to semicolons!
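A minimal sketch of the syntax above, assuming hypothetical dbo.Customers (target) and dbo.CustomersStage (source) tables that share custid and companyname columns:

MERGE INTO dbo.Customers AS target
USING dbo.CustomersStage AS source
    ON target.custid = source.custid
WHEN MATCHED AND target.companyname <> source.companyname THEN
    UPDATE SET target.companyname = source.companyname
WHEN NOT MATCHED BY TARGET THEN
    INSERT (custid, companyname) VALUES (source.custid, source.companyname)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
OUTPUT $action, inserted.custid, deleted.custid;  -- the statement must end with a semicolon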
15. MERGE Statement $action function in OUTPUT clause Multiple WHEN clauses possible For MATCHED and NOT MATCHED BY SOURCE Only one WHEN clause for NOT MATCHED Rows affected includes total rows affected by all clauses
16. MERGE Performance MERGE statement is transactional No explicit transaction required One pass through tables At most a full outer join Matching rows (inner join) = when matched Left-outer join rows = when not matched Right-outer join rows = when not matched by source When optimizing, optimize joins Index columns used in ON clause (if possible, unique index on source table)
17. MERGE and Determinism UPDATE using a JOIN is non-deterministic If more than one row in source matches ON clause, either/any row can be used for the UPDATE. MERGE is deterministic If more than one row in source matches ON clause, an exception is thrown.
18. MERGE & Triggers When there are triggers on the target table Trigger is raised per DML (insert/update/delete) “Inserted” and “deleted” tables are populated In much the same way, MERGE is treated by replication as a series of insert, update and delete commands.
19. Where is MERGE useful? Replace UPDATE … JOIN and DELETE … JOIN statements ETL Processes Set comparison Update-if-exists, insert-otherwise procedures IF EXISTS (SELECT … FROM tbl) UPDATE tbl … ELSE INSERT …
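For example, the "update if exists, insert otherwise" pattern above could be replaced by a single MERGE; procedure, table and column names here are hypothetical:

CREATE PROCEDURE dbo.UpsertCustomer
    @custid INT, @companyname NVARCHAR(100)
AS
MERGE INTO dbo.Customers AS target
USING (SELECT @custid AS custid, @companyname AS companyname) AS source
    ON target.custid = source.custid
WHEN MATCHED THEN
    UPDATE SET target.companyname = source.companyname
WHEN NOT MATCHED THEN
    INSERT (custid, companyname) VALUES (source.custid, source.companyname);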
21. Table Types SQL Server has table variables DECLARE @t TABLE (id int); SQL Server 2008 adds strongly typed table variables CREATE TYPE mytab AS TABLE (id int); DECLARE @t mytab; Parameters must use strongly typed table variables
22. Table Variables are Input Only Declare and initialize TABLE variable DECLARE @t mytab; INSERT @t VALUES (1), (2), (3); EXEC myproc @t; Parameter must be declared READONLY CREATE PROCEDURE usetable ( @t mytab READONLY ...) AS INSERT INTO lineitems SELECT * FROM @t; UPDATE @t SET... -- no!
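Putting the two slides together, a minimal end-to-end sketch (the lineitems table is assumed to exist with a single id column):

CREATE TYPE mytab AS TABLE (id INT);
GO
CREATE PROCEDURE usetable (@t mytab READONLY)  -- table-valued parameters must be READONLY
AS
    INSERT INTO lineitems (id) SELECT id FROM @t;
GO
DECLARE @t mytab;
INSERT @t VALUES (1), (2), (3);  -- row constructors, see slide 28
EXEC usetable @t;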
23. TVP Implementation and Performance Table variables are materialized in tempdb Faster than parameter arrays; the BCP APIs are still fastest [Chart: duration (ms) vs. number of rows passed]
25. Grouping Sets GROUP BY Extensions GROUPING SETS* Specifies multiple groupings of data in one query CUBE* All permutations of the column set ROLLUP* Subtotals, super-aggregates and grand total * Don't confuse with the old, nonstandard WITH CUBE and WITH ROLLUP options
26. GROUPING_ID() New function that computes the level of grouping Takes a set of columns as parameters and returns the grouping level Helps identify the grouping set that each result row belongs to The GROUPING(col) function returns a single bit: 1 if the column is aggregated (rolled up) in that result row, 0 if the row is actually grouped by the column
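A small sketch combining GROUPING SETS and GROUPING_ID, assuming a hypothetical dbo.Orders table with custid, empid and qty columns:

SELECT custid, empid, SUM(qty) AS total_qty,
       GROUPING_ID(custid, empid) AS grp_level  -- 0 = (custid, empid), 1 = (custid), 2 = (empid), 3 = grand total
FROM dbo.Orders
GROUP BY GROUPING SETS ((custid, empid), (custid), (empid), ());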
28. Table Value Constructors Use the VALUES clause to construct a set of rows, to insert multiple rows or to use as a derived table INSERT INTO dbo.Customers(custid, companyname, phone, address) VALUES (1, 'cust 1', '(111) 111-1111', 'address 1'), (2, 'cust 2', '(222) 222-2222', 'address 2'), (3, 'cust 3', '(333) 333-3333', 'address 3'); SELECT * FROM ( VALUES (CAST('20090730' AS DATE),'Tisha B''Av') ,(CAST('20090918' AS DATE),'Erev Rosh HaShana') ,(CAST('20090919' AS DATE),'Rosh HaShana') ) AS holidays (HolidayDate,description)
29. FILESTREAM Storage To Blob Or Not To Blob? Cons (of storing BLOBs in the database): LOBs take memory buffers Updating LOBs causes fragmentation Poor streaming capabilities Pros: Transactional consistency Point-in-time backup & restore Single storage and query vehicle
57. FILESTREAM Programming Dual Programming Model TSQL (Same as SQL BLOB) Win32 Streaming File IO APIs Begin a SQL Server Transaction Obtain a symbolic PATH NAME & TRANSACTION CONTEXT Open a handle using sqlncli10.dll - OpenSqlFilestream Use Handle Within System.IO Classes Commit Transaction
58. FILESTREAM Programming
// 1. Start up a database transaction.
SqlTransaction txn = cxn.BeginTransaction();
// 2. Insert a row to create a handle for streaming.
new SqlCommand("INSERT <Table> VALUES (@mediaId, @fileName, @contentType);", cxn, txn);
// 3. Get a FILESTREAM path name & transaction context.
new SqlCommand("SELECT PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT() FROM <Table>", cxn, txn);
// 4. Get a Win32 file handle using a SQL Native Client call.
SafeFileHandle handle = SqlNativeClient.OpenSqlFilestream(...);
// 5. Open up a new stream to write the file to the blob.
FileStream destBlob = new FileStream(handle, FileAccess.Write);
// 6. Loop through the source file and write to the FileStream handle.
while ((bytesRead = sourceFile.Read(buffer, 0, buffer.Length)) > 0) { destBlob.Write(buffer, 0, bytesRead); }
// 7. Commit transaction, clean up connection.
txn.Commit();
59. FILESTREAM Implementation Server & instance level: Enable FILESTREAM (in setup or Configuration Manager) Make sure port 445 (SMB) is open if remote access is used EXEC sp_configure 'filestream access level', [0/1/2] At database level: Create a FILESTREAM filegroup & map it to a directory At table level: Define VARBINARY(MAX) FILESTREAM column(s) Must have a UNIQUEIDENTIFIER column (and a file extension column if full-text search is used)
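A minimal T-SQL sketch of these steps (the database name, file paths and column names are hypothetical, and FILESTREAM must already be enabled at the Windows service level):

EXEC sp_configure 'filestream access level', 2;  -- 0 = disabled, 1 = T-SQL only, 2 = T-SQL + Win32 streaming
RECONFIGURE;

CREATE DATABASE MediaDB
ON PRIMARY (NAME = MediaDB_data, FILENAME = 'C:\Data\MediaDB.mdf'),
FILEGROUP MediaFS CONTAINS FILESTREAM (NAME = MediaDB_fs, FILENAME = 'C:\Data\MediaFS')
LOG ON (NAME = MediaDB_log, FILENAME = 'C:\Data\MediaDB.ldf');
GO
USE MediaDB;
CREATE TABLE dbo.MediaFiles (
    MediaId  UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),  -- required for FILESTREAM
    FileName NVARCHAR(260),
    Content  VARBINARY(MAX) FILESTREAM
);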
60. FILESTREAM Implementation Integrated security ACLs (NT permissions) granted only to SQL Server service account. Permissions to access files implied by granting read/write permissions on FILESTREAM column in SQL Server. Naturally, only Windows authentication is supported. Integrated management BACKUP and RESTORE (for both database and log) also backup and restore FILESTREAM data. Note that in the FULL RECOVERY MODEL, “deleted” files are not deleted until log is backed up.
61. FILESTREAM Limitations Not supported: Remote FILESTREAM storage Database snapshots and mirroring Supported: Replication (with SQL Server 2008 subscribers) Log shipping (with SQL Server 2008 secondaries) SQL Server Express Edition (the 4 GB size limit does not apply to the FILESTREAM data container) Features not integrated: SQL encryption Table-valued parameters
71. 2005 Alternatives - Adjacency Model Bad performance with large trees
76. HierarchyID New data type in SQL Server 2008 Uses path enumeration, but in a much more efficient binary representation. Exposes methods to query the tree and set relations between nodes. Remember that it’s just data It’s up to the application to assign correct hierarchyID values to the nodes to represent the relations (using HierarchyID methods) It’s up to the developer/DBA to create a unique constraint on the hierarchyID column
77. HierarchyID Nodes are ordered Root: / First child: /1/ Second Child: /2/ First grandchild: /1/1/ Second: /1/2/ If you want to insert another grandchild between them - /1/1.1/
78. HierarchyID Methods hierarchyid::GetRoot() - get the root of the hierarchy tree hierarchyid::Parse() - same as CAST(@str AS hierarchyid) node.ToString() - logical string representation of the node's hierarchyid parent.GetDescendant(c1, c2) - returns a child node of parent between c1 and c2 node.GetAncestor(n) - hierarchyid of the nth ancestor of node node.IsDescendantOf(pID) - true if node is a descendant of pID node.GetLevel() - integer representing the depth of node node.GetReparentedValue(oldRoot, newRoot) - path under newRoot for a node that is a descendant of oldRoot (used to move sub-trees within the tree)
79. HierarchyID Indexes Depth-first index: ordered by path; useful for querying sub-trees, e.g. all files in a subfolder Breadth-first index: ordered by level, then path; useful for querying immediate children
80. Demo Structure Insert root Insert 1st subordinate Insert rest of tree Query hierarchical data
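A minimal sketch of such a demo with a hypothetical dbo.Employees table (the clustered primary key on the hierarchyid column doubles as the unique constraint and a depth-first index):

CREATE TABLE dbo.Employees (
    node hierarchyid NOT NULL PRIMARY KEY CLUSTERED,  -- depth-first index, enforces uniqueness
    name NVARCHAR(50)
);

-- Insert root
INSERT dbo.Employees VALUES (hierarchyid::GetRoot(), 'CEO');

-- Insert 1st subordinate under the root
DECLARE @root hierarchyid = hierarchyid::GetRoot();
INSERT dbo.Employees VALUES (@root.GetDescendant(NULL, NULL), 'VP');

-- Query hierarchical data: the whole sub-tree under the root
SELECT name, node.ToString() AS path, node.GetLevel() AS level
FROM dbo.Employees
WHERE node.IsDescendantOf(@root) = 1;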
83. Date/Time Types in SQL Server 2008 Save date and time in separate columns: query all logins that occurred 18:00-7:00 during weekdays Higher precision: scientific data Time zone offset: global applications
84. DATE and TIME DATE Data Type - Date Only 01-01-0001 to 31-12-9999 Gregorian Calendar Takes only 3 bytes TIME Data Type - Time Only Variable Precision - 0 to 7 decimal places for seconds Up to 100 nanoseconds Takes 3-5 bytes, depending on precision
85. DATETIME2 and DATETIMEOFFSET DATETIME2 Data Type 01-01-0001 to 31-12-9999 Gregorian Calendar Variable precision - up to 100 nanoseconds Takes 6-8 bytes (same or less than DATETIME!) DATETIMEOFFSET 01-01-0001 to 31-12-9999 Gregorian Calendar Variable precision - up to 100 nanoseconds Time zone offset (from UTC) preserved No daylight saving time support Takes 8-10 bytes
86. Date/Time Types Compatibility New data types use the same T-SQL functions: DATENAME (datepart, date) DATEPART (datepart, date) DATEDIFF (datepart, startdate, enddate) DATEADD (datepart, number, date) datepart can also be microsecond, nanosecond or TZoffset Also MONTH, DAY, YEAR, CONVERT
87. Date Time Library Extensions Current date/time in higher precision SYSDATETIME SYSUTCDATETIME SYSDATETIMEOFFSET Original date/time uses GETDATE, GETUTCDATE, CURRENT_TIMESTAMP ISDATE(datetime/smalldatetime) Special functions for DATETIMEOFFSET SWITCHOFFSET(datetimeoffset, timezone) TODATETIMEOFFSET(datetime, timezone)
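A short sketch of the new types and functions together (the '+02:00' offset is just an example value):

DECLARE @d  DATE           = '2009-07-30';
DECLARE @t  TIME(3)        = '18:30:00.123';
DECLARE @dt DATETIME2(7)   = SYSDATETIME();
DECLARE @do DATETIMEOFFSET = SYSDATETIMEOFFSET();

SELECT DATEADD(nanosecond, 100, @dt)         AS plus_100ns,
       DATEPART(TZoffset, @do)               AS offset_minutes,
       SWITCHOFFSET(@do, '+02:00')           AS same_instant_other_zone,
       TODATETIMEOFFSET(GETDATE(), '+02:00') AS legacy_datetime_with_offset;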
89. Spatial Data Type Represents geometric data: Points Lines Polygons Multi-point/line/polygon
90. Spatial Data Type SQL Server supports two spatial data types GEOMETRY - flat earth model GEOGRAPHY - round earth model Both types support all of the instantiable OGC types The InstanceOf method can distinguish between them
91. Spatial Data Type - Input Spatial data is stored in a proprietary binary format An instance of the type can be NULL Can be input as: Well-Known Binary - ST[Type]FromWKB Well-Known Text - ST[Type]FromText Geography Markup Language (GML) - GeomFromGml Can also use SQLCLR functions: Parse, Point - extension methods Input from a SQLCLR type - SqlGeometry, SqlGeography
92. Spatial Data Type - Output Spatial data can be output as: Well-Known Binary - STAsBinary Well-Known Text - STAsText GML - AsGml Text with Z and M values - AsTextZM SQLCLR standard method ToString - returns Well-Known Text As a SQLCLR object - SqlGeometry, SqlGeography Other useful formats (GeoRSS, KML) are not directly supported
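A small sketch of input and output using Well-Known Text (the coordinates are arbitrary example values):

DECLARE @point   GEOMETRY  = geometry::STGeomFromText('POINT (3 4)', 0);
DECLARE @polygon GEOMETRY  = geometry::STGeomFromText('POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))', 0);
DECLARE @city    GEOGRAPHY = geography::Point(32.08, 34.78, 4326);  -- lat, long, SRID (extension method)

SELECT @point.STAsText()             AS point_wkt,
       @polygon.STArea()             AS polygon_area,   -- 100
       @point.STIntersects(@polygon) AS point_inside,   -- 1
       @city.ToString()              AS city_wkt;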
95. Management Studio Enhancements Multi-instance queries Quickly run SQL statements on multiple instances Downside: if you want to insert the result set into a single table, you'd better find another tool Useful for ad-hoc queries, not much more Custom status bar color (per server) Quick trace New Activity Monitor
127. Row Compression CREATE TABLE Countries (ID INT, Name CHAR(50)) Without compression each row stores 4 bytes for ID and 50 bytes for Name: (4b + 50b) * 3 rows = 162 bytes (not including row overhead)
128. Row Compression With row compression every value is stored in variable-length format: in the example the three rows take 1 + 7, 1 + 11 and 1 + 6 bytes instead of 4 + 50 bytes each
129. Row Compression 162 bytes reduced to: 1+7+1+11+1+6 = 27 bytes (not including row overhead and 4 bits per column for offsets)
131. The largest value for each column is stored in the compression information structure (CI).
132. The in-row values are replaced with indicators of full or partial matches with the value in the CI.
134. Page Compression 2nd phase: prefix compression example
136. Dictionary compression: the in-row values are replaced with pointers to the CI area.
139. Setting data compression on a table (CREATE/ALTER TABLE) only affects the heap or the clustered index.
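A brief sketch of the relevant syntax (the Countries table continues the example from slide 127; the ix_Name index is hypothetical):

-- Compress a table (heap or clustered index) at creation time
CREATE TABLE dbo.Countries (ID INT, Name CHAR(50))
    WITH (DATA_COMPRESSION = ROW);

-- Change compression on an existing table / heap
ALTER TABLE dbo.Countries REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Nonclustered indexes are compressed separately
ALTER INDEX ix_Name ON dbo.Countries REBUILD WITH (DATA_COMPRESSION = ROW);

-- Estimate savings before enabling compression
EXEC sp_estimate_data_compression_savings 'dbo', 'Countries', NULL, NULL, 'PAGE';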
141. When is data compressed? With ROW compression, compression occurs row by row, whenever a row is inserted or updated. With PAGE compression, it gets complicated. In heaps, PAGE compression only occurs in the following ways: On table REBUILD When data is inserted with BULK INSERT When data is inserted with INSERT INTO … WITH(TABLOCK)
142. When is data compressed? In indexes, pages are only ROW compressed until they fill up. When the next insert occurs, it triggers PAGE compression. If there’s space left for the new row after PAGE compression, the row is compressed and inserted into the newly compressed page. If there’s no space, the page is not compressed and the row will be inserted on a new page
144. Data Compression: Limitations Data compression is Enterprise Edition only Data compression cannot be used in conjunction with sparse columns Data compression doesn’t increase the capacity per row. You can’t insert more data per row with data compression. This ensures that disabling compression will always succeed
145. Data Compression: Limitations Non-leaf level pages in indexes are only compressed using ROW compression Heap pages are not PAGE compressed during regular DML LOB values stored out-of-row are not compressed Data exported using BCP is always uncompressed Data imported using BCP will be compressed, causing increased CPU usage
150. Data Compression Performance Generalization No noticeable impact on index seeks Noticeable impact on large data modification operations Lower IO, higher CPU on index/table scans Duration depends on overall system configuration Data compression should be tested thoroughly before being enabled in production
151. So when should I use it? Read-intensive databases with low CPU usage Systems with small Buffer cache (relative to data size) Databases queried for few rows at a time (i.e. index seeks). There’s always a trade-off. Remember the equation: Data compression = Less IO = Smaller memory footprint But also Data compression = Higher CPU usage
160. References SQL Server 2008 Developer Training Kit Introduction to New T-SQL Programmability Features in SQL Server 2008 / Itzik Ben Gan Using The Resource Governor / Aaron Bertrand Reporting Services in SQL Server 2008 / Ann Weber Data Compression: Strategy, Capacity Planning and Best Practices / Sanjay Mishra