SQL Server 2012 Beyond Relational Performance and Scale



Pragmatic Works SQL Server 2012 Webinar presentation

Published in: Technology

  • Let's take a look at a Beyond Relational (BR) application. What services does it provide? What about having these services supported in the database instead of each application building its own?
  • Examples: Managing an application that keeps images in the file system and additional information in the database. Building a spatial database application before SQL Server 2008. Example services: backup/restore, search over relational and non-relational data.
  • Pure relational database system.
  • SQL Server 7.0: Added full-text (FT) search over unstructured data.
  • SQL 2000: Starting to add XML support. SQL 2005: XML data type, XQuery, XML indexes. SQL 2008: Spatial data type and operations, spatial indexing, FILESTREAM with Win32 access (but requires a special library to open/create), integrated FTS. FILESTREAM requires NTFS.
  • As of SQL Server 2012: Exposing Win32 natively through FileTable. Addition of the Semantic Platform to enable semantic search (and eventually, post-Denali, query). Efficient Storage: Building on existing relational storage and indexing infrastructure and backup/restore/HA. Brings SQL Server's superior TCO to BR data and assures efficient and safe storage of customers' high-value data. Rich Capabilities: Necessary (but not sufficient) programmability experience to move customers to entrust their high-value data to SQL with minimal migration pains and access it via their favorite programming model/API. Rich Services: Provide high-value services to unlock information in all data in a highly scalable way. Entices customers to move their high-value data into SQL to discover information fast. Provides platform stickiness and differentiation.
  • Focus in SQL Server 2012 in priority order: capabilities and rich services for unstructured data; spatial platform; sustain existing BR support; tooling; performance and scale; orthogonality; large new features.
  • SQL 2008 provides FILESTREAM as a way to add large blobs/unstructured data streams into SQL while still being able to open a Win32 handle (using the SQL API) and get high streaming performance for the data. Win32 namespace support in SQL Server 2012 has the following goals. Reduce the barrier to entry for customers who have data in file servers and Win32 applications that currently work on it: by enabling the Win32 namespace, SQL will generate a Windows share that can be exposed to existing Win32 applications like any file server share. This allows Win32 applications and mid-tier servers (like IIS) to work with this data without having to understand database/transaction semantics. Single integrated set of admin tools: SQL backup/restore, replication, HA solutions, etc. Scale up: add multiple disks on a machine for storing FILESTREAM data. Use SQL services like full-text search for both FILESTREAM and relational metadata, and the Property Promotion Infrastructure for extracting interesting properties from SQL blobs/FILESTREAM to surface as relational columns for query.
  • Reading bigger buffers gives better performance. FS volume: Dedicated volumes means volumes not used for tempdb (non-OS, paging, SQL data and log volumes). If stored files are large, as we generally recommend, format with 64K clusters. Do compress FILESTREAM volumes or FILESTREAM containers, but ONLY if the data to be stored is compressible; note that in this case the NTFS cluster size must be 4K. One volume per container enables space management at the volume level. AV should be configured to quarantine infected files rather than delete them; otherwise corruption will be reported. SMB: With 60 KB, a read can happen in one single I/O and ideally comes back in one single TCP/IP packet. It is not 64 KB because 64 KB of data can't fit in one single TCP/IP buffer. Partitioning: FILESTREAM columns require the presence of the ROWGUID unique index for aligned partitioning or, where this is not possible, explicitly specifying the data placement option for the unique or primary key constraint on the ROWGUID column.
  • Customer lab testing with 220 MB video files. FILESTREAM Win32 reads: video streaming performance. FILESTREAM best practices.
  • Stats on inserts followed by reads. 8.3 etc…
  • Optimized hot paths, removed unnecessary serialization, expensive file system operations, etc.
  • TB
  • Experimentation: For instance, consider this dataset: US Highways. In this dataset some of the LineStrings are quite long (over 2000 miles) and others are quite short (400 meters or less). For optimal performance, the following two indexes were roughly equivalent. Geography index: MEDIUM, MEDIUM, MEDIUM, MEDIUM, 1024. Geometry index: LOW, LOW, LOW, LOW, 1024.
  • SQL Server 2012 Beyond Relational Performance and Scale
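The notes above describe FileTable as the way SQL Server 2012 exposes FILESTREAM data through a Windows share. A minimal T-SQL sketch of that idea follows. The table name `dbo.Documents` and the directory name are hypothetical, and the sketch assumes the database already has a FILESTREAM filegroup and was configured with non-transacted access and a database-level directory name.

```sql
-- Assumption: FILESTREAM is enabled on the instance, and this database already
-- has a FILESTREAM filegroup plus NON_TRANSACTED_ACCESS and DIRECTORY_NAME set.
-- dbo.Documents and N'Documents' are hypothetical names used for illustration.
CREATE TABLE dbo.Documents AS FILETABLE
WITH (
    FILETABLE_DIRECTORY = N'Documents',              -- folder name under the database directory
    FILETABLE_COLLATE_FILENAME = database_default
);
GO

-- UNC root of the FileTable, as a Win32 application would see it.
SELECT FileTableRootPath(N'dbo.Documents') AS share_root;

-- Files dropped into the share show up as rows; list them with their full UNC paths.
SELECT name,
       cached_file_size,
       file_stream.GetFileNamespacePath(1, 0) AS unc_path
FROM dbo.Documents
WHERE is_directory = 0;
```

Because the files are ordinary rows, they are covered by the same backup/restore and HA tooling as the rest of the database, which is the "single integrated set of admin tools" goal stated in the notes.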

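The US Highways experiment in the notes compares two manually tuned spatial indexes. The sketch below shows how such settings might be written. The table `dbo.Highways`, its columns `geog` and `geom`, the index names, and the bounding box are all hypothetical, and reading the trailing 1024 in the notes as `CELLS_PER_OBJECT` is an assumption.

```sql
-- Assumption: dbo.Highways has a clustered primary key (required for spatial
-- indexes), a geography column geog and a geometry column geom. All names here
-- are hypothetical; the notes give only the grid densities and the value 1024.

-- Geography index: MEDIUM density at all four grid levels.
CREATE SPATIAL INDEX six_Highways_geog
ON dbo.Highways (geog)
USING GEOGRAPHY_GRID
WITH (
    GRIDS = (LEVEL_1 = MEDIUM, LEVEL_2 = MEDIUM, LEVEL_3 = MEDIUM, LEVEL_4 = MEDIUM),
    CELLS_PER_OBJECT = 1024   -- assumed meaning of "1024" in the notes
);

-- Geometry index: LOW density at all four grid levels.
-- Geometry grids need an explicit bounding box; these values are illustrative only.
CREATE SPATIAL INDEX six_Highways_geom
ON dbo.Highways (geom)
USING GEOMETRY_GRID
WITH (
    BOUNDING_BOX = (xmin = -125, ymin = 24, xmax = -66, ymax = 50),
    GRIDS = (LEVEL_1 = LOW, LEVEL_2 = LOW, LEVEL_3 = LOW, LEVEL_4 = LOW),
    CELLS_PER_OBJECT = 1024   -- assumed meaning of "1024" in the notes
);
```

Which settings work best depends on the data, so treat these as a starting point for the kind of experimentation the notes describe, not as recommended values.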
    1. Structured and unstructured search. Related/"semantic" search.
    2. Building and maintaining applications with relational and non-relational data is hard. Pain points: complex integration; duplicated functionality; compensation for unavailable services. Goals: reduce the cost of managing all data; simplify the development of applications over all data; provide management and programming services for all data.
    3. Tables, XML, Spatial, Documents, Digital Media, Scientific Records, Factoids… Data formats and content natively understood for rich application and user experience. Consistent application model and data constructs to ease application development, migration and long-term retention. Provide rich services, e.g.,
    4. [Architecture diagram] Programmability: T-SQL. Query. Structured data. B-trees. Manageability and availability. Files.
    5. [Architecture diagram] Programmability: T-SQL. Query, search. Structured data, unstructured data. B-trees. Manageability and availability. Files.
    6. [Architecture diagram] Programmability: T-SQL/data types; Spatial, XML, HierarchyID; Win32. Query and type operations: XQuery, spatial ops, search. Data: structured, semi-structured/XML, unstructured. Indices: B-trees, XML, FTS, spatial. FILESTREAM. Manageability and availability. Files.
    7. [Architecture diagram] Rich data programming capabilities: T-SQL/data types; Spatial, XML, HierarchyID; Win32. Rich query and search services over all data: query and type operations, XQuery, spatial ops, search, Semantic Platform. Efficient storage for BR data: structured, semi-structured/XML, and unstructured data; B-trees, XML, FTS, and spatial indices; FILESTREAM. Manageability and availability. Files.
    8. [Diagram] Transactional access: database applications, SQL apps, blobs, FILESTREAM API. Streaming Win32 access: Windows apps, SMB share, files/folders. Rich services: full-text search, semantic similarity search. Database solutions: FileTable, FileStreams. Scale-up: multiple containers (Disk 1, Disk 2, Disk 3). Integrated administration: integrated backup/replication/AlwaysOn. Remote BLOB Storage: customer application, SQL RBS API, provider libraries (Centera, Azure, FILESTREAM), SQL DB.
    9. Store BLOBs in DB + file system. [Diagram: Application, BLOB, DB]
    10. FileTable folder hierarchy. FILESTREAM share: MSSQLSERVER on my_machine. Database directories: Office Docs (Database1), Private Docs (Database2). FileTable directories: Media (FileTable), Documents (FileTable), LogFiles (FileTable). User-defined directory structure.
    11. [Chart: throughput (Mbps) and gain (%) by file size (240 KB, 480 KB, 1 MB, 2 MB, 4 MB, 8 MB). Series: Filestream Win32 (Filesystem) Access, Filestream T-SQL, Varbinary, Filesystem Win32 Access, Gain (%).]
    12. Insert. [Chart: throughput (Mbps) by file size (240 KB, 480 KB, 1 MB, 2 MB, 4 MB, 8 MB). Series: Filestream Win32 (Filesystem) Access, Filestream T-SQL, Varbinary.]
    13. CREATE/ALTER DATABASE max_size. DBCC SHRINKFILE EMPTYFILE.
    14. Use of multiple spindles for achieving better I/O scalability.
    15. [Chart: 2012]
    16. Queries over a 350M-document database with random DML running in the background. Beats SQL Server 2005 by a scale factor of more than 2x, with on average 60x better throughput.
    17. 2005/8 vs 2012. [Chart: query avgExecTime (ms) under various numbers of connections (50 to 2000 users) for the customer playback benchmark.]
    18. geography::Point(lat, lon, 4326)
    19. [Diagram: candidates A to E pass through the primary filter (index lookup), then the secondary filter (original predicate).] In general, split predicates in two. The primary filter finds all candidates, possibly with false positives (but never false negatives). The secondary filter removes false positives. The index provides our primary filter. The original predicate is our secondary filter. Some tweaks to this scheme: it is sometimes possible to skip the secondary filter.
    20. Fully contained cells. Partially contained cells.
    21. Optimal value (theoretical) is somewhere between two extremes. [Chart label: time needed to process false positives.] Default values: 512 for Geometry AUTO grid, 768 for Geography AUTO grid, 1024 for MANUAL grids. SELECT * FROM table t WITH (SPATIAL_WINDOW_MAX_CELLS=256) WHERE t.geom.STIntersects(@window)=1;
    22. CREATE SPATIAL INDEX idxGeog ON table(geography column) USING GEOGRAPHY_GRID WITH ( DATA_COMPRESSION = page | row ); On the basis of internal tests, with compression: indexes 40%-50% smaller; queries between 20% faster and 15% slower. Per-partition compression setting is not supported.
    23. Give me the closest 5 Italian restaurants. SQL Server 2008/2008 R2: table scan. SQL Server 2012: uses spatial index. SELECT TOP(5) * FROM Restaurants r WHERE r.type = 'Italian' AND r.pos.STDistance(@me) IS NOT NULL ORDER BY r.pos.STDistance(@me)
    24. Find the closest 50 business points to a specific location (out of 22 million in total).
    25. http://www.slideshare.net/MichaelRys/sql-bits-brruds ; http://www.slideshare.net/MichaelRys/filetable-and-semantic-search-in-sql-server-2012 ; http://www.sqlserverlaunch.com/WW/theater?sid=634 ; http://www.slideshare.net/MichaelRys/sqlbits-x-sql-server-2012-spatial ; http://www.slideshare.net/MichaelRys/sqlbits-x-sql-server-2012-spatial-indexing ; Forum: http://forums.microsoft.com/MSDN/ShowForum.aspx?ForumID=1629&SiteID=1
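The primary/secondary filter split from slide 19 can be observed directly in T-SQL by comparing `Filter()`, which may answer from the spatial index alone, with `STIntersects()`, which applies the exact predicate. This is a minimal sketch: it reuses the `Restaurants` table and `pos` column from slide 23 and assumes `pos` is a geography column with a spatial index. The coordinates and the 5 km radius are illustrative values, not taken from the slides.

```sql
-- Assumption: Restaurants.pos is a geography column with a spatial index on it.
-- The location and radius below are illustrative only.
DECLARE @me geography = geography::Point(47.6062, -122.3321, 4326);  -- lat, lon, SRID
DECLARE @window geography = @me.STBuffer(5000);                      -- about 5 km around @me

-- Primary filter only: index-based candidates, which may include false positives
-- but never miss a true match. Without a usable index, Filter() behaves like
-- STIntersects().
SELECT COUNT(*) AS candidates
FROM Restaurants AS r
WHERE r.pos.Filter(@window) = 1;

-- Primary plus secondary filter: the exact predicate removes the false positives,
-- so this count is never larger than the one above.
SELECT COUNT(*) AS exact_matches
FROM Restaurants AS r
WHERE r.pos.STIntersects(@window) = 1;
```

When approximate results are acceptable, for example when drawing a map at a coarse zoom level, using `Filter()` alone is one way to skip the secondary filter, as the slide suggests is sometimes possible.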