SQL Server 2012 Beyond Relational Performance and Scale

Structured and unstructured search

Related ("semantic") search

Building and maintaining applications with relational and non-relational data is hard

Pain points:
- Complex integration
- Duplicated functionality
- Compensation for unavailable services

Goals:
- Reduce the cost of managing all data
- Simplify the development of applications over all data
- Provide management and programming services for all data

Tables, XML, Spatial, Documents, Digital Media, Scientific Records, Factoids…

- Data formats and content natively understood for rich application and user experience
- Consistent application model and data constructs to ease application development, migration and long-term retention

Provide rich services, e.g.,

[Figure: the beyond-relational platform, built up over four slides. Labels in the final build:]
- Rich data programming capabilities: Programmability (T-SQL/data types: Spatial, XML, HierarchyID; Win32)
- Rich query and search services over all data: Query and Search (XQuery, type operations, spatial ops, semantic)
- Data: structured data, semi-structured data/XML, unstructured data
- Efficient storage for BR data: B-trees; XML, FTS and spatial indices; Filestream; files
- Manageability & Availability

[Figure: access paths for database applications, Windows apps and SQL apps. Labels: transactional access, streaming Win32 access, blobs, SMB share, files/folders, FileStream API.]

[Figure: rich services and scale-up solutions in the database. Labels: full-text search, semantic similarity search, FileTable, FileStreams, multiple containers (disks 1, 2, 3).]

[Figure: integrated administration and Remote BLOB Storage. Labels: customer application, SQL RBS API, provider libraries (Azure lib, Centera lib, SQL FILESTREAM lib), stores (Azure, Centera, SQL DB), DB FileStreams, integrated backup/replication/AlwaysOn.]

Store BLOBs in DB + file system

[Figure labels: application, BLOB, DB]

FileTable folder hierarchy

[Figure: folder tree, top to bottom]
- Machine: my_machine
- FILESTREAM share: MSSQLSERVER
- Database directories: Private Docs (Database1), Office Docs (Database2)
- FileTable directories: Media (FileTable), Documents (FileTable), LogFiles (FileTable)
- User-defined directory structure

[Chart: throughput (Mbps) by BLOB size (240 KB, 480 KB, 1 MB, 2 MB, 4 MB, 8 MB). Series: Filestream Win32 (filesystem) access, Filestream T-SQL, varbinary, filesystem Win32 access gain (%).]

[Chart: Insert. Throughput (Mbps) by BLOB size (240 KB, 480 KB, 1 MB, 2 MB, 4 MB, 8 MB). Series: Filestream Win32 (filesystem) access, Filestream T-SQL, varbinary.]

CREATE/ALTER DATABASE
max_size
DBCC SHRINKFILE with EMPTYFILE

Use multiple spindles to achieve better I/O scalability

Queries over a 350M-document database, with random DML running in the background.
SQL Server 2012 beats SQL Server 2005 by a scale factor of more than 2x, with on average 60x better throughput

[Chart: 2005/8 vs 2012. Query avgExecTime (ms) for varying numbers of connections (50 to 2000 users) in a customer playback benchmark. Series: 2005/8, 2012.]

geography::Point(lat, lon, 4326)

[Figure: candidate objects A to E narrowed first by the primary filter (index lookup) and then by the secondary filter (original predicate).]

In general, split predicates in two:
- The primary filter finds all candidates, possibly with false positives (but never false negatives)
- The secondary filter removes false positives

The index provides our primary filter; the original predicate is our secondary filter.

Some tweaks to this scheme:
- Sometimes it is possible to skip the secondary filter

[Figure: fully contained cells vs. partially contained cells]
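The primary/secondary split can be illustrated outside SQL Server. The Python below is a minimal sketch, not SQL Server's implementation: it assumes a uniform grid with a hypothetical cell size `CELL`, axis-aligned rectangles as the spatial objects, and rectangle overlap as the "original predicate". All names are invented for the illustration. Objects found in a cell that lies entirely inside the query window are accepted without running the secondary filter.

```python
from collections import defaultdict

CELL = 10  # hypothetical grid cell size; rectangles are (xmin, ymin, xmax, ymax)

def cells_for(rect):
    """Yield every (col, row) grid cell that the rectangle touches."""
    xmin, ymin, xmax, ymax = rect
    for cx in range(int(xmin // CELL), int(xmax // CELL) + 1):
        for cy in range(int(ymin // CELL), int(ymax // CELL) + 1):
            yield (cx, cy)

def intersects(a, b):
    """Exact predicate (secondary filter): do two rectangles overlap?"""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def cell_inside(cell, window):
    """True if the whole grid cell lies inside the query window."""
    cx, cy = cell
    return (window[0] <= cx * CELL and (cx + 1) * CELL <= window[2]
            and window[1] <= cy * CELL and (cy + 1) * CELL <= window[3])

def build_index(objects):
    """Map each grid cell to the ids of the objects touching it."""
    index = defaultdict(set)
    for oid, rect in objects.items():
        for cell in cells_for(rect):
            index[cell].add(oid)
    return index

def query(objects, index, window):
    """Return (primary filter output, final result) for a window query."""
    sure, candidates = set(), set()
    for cell in cells_for(window):
        hits = index.get(cell, ())
        if cell_inside(cell, window):
            sure.update(hits)        # fully contained cell: no exact check needed
        else:
            candidates.update(hits)  # partially contained cell: may be a false positive
    candidates -= sure
    primary = sure | candidates
    result = sure | {o for o in candidates if intersects(objects[o], window)}
    return primary, result

objects = {
    "A": (12, 12, 18, 18),  # sits in a cell fully inside the window
    "B": (1, 1, 3, 3),      # shares a cell with the window but does not overlap it
    "C": (27, 27, 29, 29),  # same situation as B
    "D": (50, 50, 55, 55),  # far away, never becomes a candidate
    "E": (22, 8, 24, 9),    # in a partially contained cell and really overlaps
}
window = (5, 5, 25, 25)
primary, result = query(objects, build_index(objects), window)
print(sorted(primary))  # ['A', 'B', 'C', 'E']
print(sorted(result))   # ['A', 'E']
```

In this toy data, B and C are the false positives that the secondary filter removes, while A skips the exact check because its cell is fully contained in the window.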

Optimal value (theoretical) is somewhere between two extremes

[Chart label: time needed to process false positives]

Default values:
512 - Geometry AUTO grid
768 - Geography AUTO grid
1024 - MANUAL grids

SELECT * FROM table t WITH (SPATIAL_WINDOW_MAX_CELLS=256)
WHERE t.geom.STIntersects(@window)=1;

CREATE SPATIAL INDEX idxGeog
ON table(geography column)
USING GEOGRAPHY_GRID
WITH (
DATA_COMPRESSION = page | row
);

Based on internal tests, with compression:
- 40%-50% smaller
- Queries range from 20% faster to 15% slower
- Per-partition compression setting is not supported

Give me the closest 5 Italian restaurants

SQL Server 2008/2008 R2: table scan
SQL Server 2012: uses spatial index

SELECT TOP(5) *
FROM Restaurants r
WHERE r.type = 'Italian'
AND r.pos.STDistance(@me) IS NOT NULL
ORDER BY r.pos.STDistance(@me)
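What the query asks for can be spelled out in a few lines of Python. This is only an illustrative sketch under stated assumptions: the row layout (dicts with `name`, `type`, `pos`) and the function `closest` are hypothetical, planar Euclidean distance stands in for `STDistance`, and a missing position (`None`) stands in for the case the `IS NOT NULL` predicate excludes. It evaluates the distance for every eligible row, much like a scan, and says nothing about how the spatial index is used.

```python
import heapq
import math

def closest(restaurants, me, k=5, kind="Italian"):
    """Return the k nearest rows of the given kind, skipping rows without a position."""
    def dist(row):
        # Planar distance; an assumption standing in for STDistance.
        return math.hypot(row["pos"][0] - me[0], row["pos"][1] - me[1])

    eligible = [r for r in restaurants
                if r["type"] == kind and r["pos"] is not None]
    # nsmallest plays the role of ORDER BY distance with TOP(k).
    return heapq.nsmallest(k, eligible, key=dist)

restaurants = [
    {"name": "a", "type": "Italian", "pos": (3, 4)},
    {"name": "b", "type": "Thai", "pos": (0, 1)},
    {"name": "c", "type": "Italian", "pos": None},
    {"name": "d", "type": "Italian", "pos": (1, 0)},
    {"name": "e", "type": "Italian", "pos": (0, 2)},
]
nearest = [r["name"] for r in closest(restaurants, (0, 0), k=2)]
print(nearest)  # ['d', 'e']
```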

Find the closest 50 business points to a specific location (out of 22 million in total)

http://www.slideshare.net/MichaelRys/sql-bits-brruds
http://www.slideshare.net/MichaelRys/filetable-and-semantic-search-in-sql-server-2012
http://www.sqlserverlaunch.com/WW/theater?sid=634

http://www.slideshare.net/MichaelRys/sqlbits-x-sql-server-2012-spatial
http://www.slideshare.net/MichaelRys/sqlbits-x-sql-server-2012-spatial-indexing
Forum:
http://forums.microsoft.com/MSDN/ShowForum.aspx?ForumID=1629&SiteID=1
