SlideShare a Scribd company logo
Inside SQL Server
In-Memory OLTP
Bob Ward, Principal Architect, Microsoft,
Data Group, Tiger Team
Latch free since 2007
Based on this whitepaper, internal
docs, source code, and reverse
engineering
Credits to Kalen
Delaney, Jos de Bruijn,
Sunil Agarwal, Cristian
Diaconu, Craig
Freedman, Alejandro
Hernandez Saenz, Jack
Li, Mike Zwilling and
the entire Hekaton
team
deck and demos at http://aka.ms/bobwardms
This is a 500 Level Presentation
It is ONE talk with a 15min break for 3 hours
=
Great primer
and
customer
stories by
@jdebruijn
Get and run the demo
yourself from github
Let’s Do This
What, How, Why, and Architecture
Data and Indexes
Concurrency, Transactions, and Logging
Checkpoint and Recovery
Natively Compiled Procedures
Resource Management
Based on SQL
Server 2016
Check out the Bonus
Material Section
if time
The Research behind the concept
OLTP Through the Looking Glass,
and What We Found There
by Stavros Harizopoulos, Daniel J.
Abadi, Samuel Madden, Michael
Stonebraker , SIGMOD 2008
TPC-C hand-coded on top of the
SHORE storage engine
The Path to In-Memory OLTP
Project Verde
• 2007
“Early”
Hekaton
• 2009/2010
Hekaton
becomes In-
Memory OLTP
• SQL Server 2014
In-Memory
OLTP
Unleashed
• SQL Server 2016
Keep SQL Server relevant in a world of high-speed hardware with dense
cores, fast I/O, and inexpensive massive memory
The need for high-speed OLTP transactions at microsecond speed
Reduce the number of instructions to execute a transaction
Find areas of latency for a transaction and reduce its footprint
Use multi-version optimistic concurrency and “lock free” algorithms
Use DLLs and native compiled procs
What, Why, and How
7
XTP = eXtreme Transaction Processing
HK = Hekaton
Minimal
diagnostics in
critical path
• Supported on Always On Availability Groups
• Supports ColumnStore Indexes for HTAP applications
• Supported in Azure SQL Database
• Cross container transactions (disk and in-memory in one transaction)
• Table variables supported
• SQL surface area expanded in SQL Server 2016. Complete support here
• LOB data types (ex. Varchar(max)) supported
• BACKUP/RESTORE complete functionality
• Use Transaction Performance Analysis Report for migration
In-Memory OLTP Capabilities
8
Preview
Hear the
case studies
“Normal” INSERT
The Path of In-Memory OLTP Transactions
9
Release latches and locks
Update index pages ( more locks and latch )
Modify page
Latch page
Obtain locks
INSERT LOG record
In-Memory OLTP INSERT
Maintain index in
memory
Insert ROW into
memory
COMMIT Transaction = Log
Record and Flush
COMMIT Transaction = Insert
HK Log Record and Flush
Page Split
Sys Tran =
Log Flush
Spinlocks
No index
logging
SCHEMA_ONLY
no logging
No latch
No spinlock
Just show it!
Demo
ἑκατόν
Architecture
and
Engine
Integration
The In-Memory OLTP Architecture
12
Runtime
Call Native Procs
Query Interop
Compile
Build Schema DLL
Build Native Proc DLL
The Engine
Allocate Memory
Checkpoint
Transaction Logging
System Tables
SQL Engine
Creating and executing
procs
Query Interop
SQL System Tables
Log Manager
HADR
In-Memory OLTP Engine Components
13
hkengine.dllhkcompile.dllhkruntime.dll
examples
below
SQLOS (sqldk.dll)
SCHEMA DLL and Native Proc DLL
Hekaton Kernel
filegroup <fg_name> contains memory_optimized_data
(name = <name>, filename = ‘<drive and path><folder name>’)
Creating the database for in-memory OLTP
14
filestream
container
we create this
• Name = DB_ID_<dbid>
• Your data goes here so this gets big
MEMORYCLERK_XTP
• Two of these for each db
• Much smaller but can grow some over time
MEMOBJ_XTPDB
HK System Tables
created
(hash index based
tables)
Create multiple
across drives
In-memory OLTP Threads
15
USER
GC
USER
USER
USER
USER
USER
Node 0
GC
Node <n>
user transactions
Garbage collection one
per node
database
CHECKPOINT
CONTROLLER
CHECKPOINT
CLOSE
LOG FLUSH
LOG FLUSH
thread
Data
Serializer
thread thread
In-Memory Thread Pools
CHECKPOINT I/O
U
U
P
XTP_OFFLINE_CKPT
H
H
P
U
H
P
P
U
U
H
P
user scheduler
hidden scheduler
preemptive
You could see
session_id >
50
Data
Serializer
dm_xtp_threads
Inside the HK
architecture
Demo
Data and
Indexes
Hash Index = single row lookup with ‘=‘
Range Index = searching for multi-rows
Creating a Table
create table starsoftheteam
(player_number int identity
primary key nonclustered hash with (bucket_count = 1048576),
player_name char(1000) not null)
with (memory_optimized = on, durability = schema_and_data)
All tables require
at least one index
Schema
stored in
SQL
system
catalog
Hash index
built in-
memory
Hk system
tables
populated
Schema
DLL built
Perform a
checkpoint
Ex. “root” table FS container files created
and written
Default = range
Inside Data and Rows
How do you find data rows in a normal SQL table?
• Heap = Use IAM pages
• Clustered Index = Find root page and traverse index
What about an in-memory table?
• Hash index table pointer known in HK metadata for a table. Hash the index key, go to bucket
pointer, traverse chain to find row
• Page Mapping Table used for range indexes and has a known pointer in HK metadata. Traverse the
range index which points to data rows
• Data exists in memory as pointers to rows (aka a heap). No page structure
All data rows have known header but data is opaque to HK engine
• Schema DLL and/or Native Compiled Proc DLL knows the format of the row data
• Schema DLL and/or Native Compiled Proc DLL knows how to find “key” inside the index
Compute your
estimated row size
here
“bag of
bytes”
Hash index
scan is
possible
A In-Memory OLTP Data Row Your data.
“bytes” to HK
engine
Points to another row in
the “chain”. One pointer for
each index on the table
Timestamp of
INSERT
Halloween
protection
Number of
pointers = #
indexes
Rows don’t
have to be
contiguous in
memory
Timestamp of
DELETE. ∞ =
“not deleted”
100,∞ Bob Dallas
Hash indexes
hash on name
columns = name char(100), city char (100)
hash index on name bucket_count = 8
hash index on city bucket_count = 8 hash on city
0 0
4
5
200,∞ Bob Fort Worth
90,∞ Ginger Austin
50,∞ Ryan Arlington
70,∞ Ginger Houston
3
4
5
chain count
= 3
Hash
collision
duplicate key !=
collision
Begin ts
reflects order
of inserts
ts
Hash Indexes
When should I use these?
Single row lookups by exact key value
Support multi-column indexes
Do I care about bucket counts and chain length?
The key is to keep the chains short
Key values with high cardinality the best and typically result in short chains
Larger number of buckets keep collisions low and chains shorter
Monitor and adjust with dm_db_xtp_hash_index_stats
SQL catalog view sys.hash_indexes also provided
Empty bucket % high ok except uses memory and can affect scans
Monitor average and max chain length (1 avg chain ideal; < 10 acceptable)
Rebuild with ALTER
TABLE. Could take
some time
REBUILD can block interop with SCH-S; Native proc rebuilt so blocked
Finding In-
Memory Data
Rows
Demo
Range Indexes
When should I use these?
Frequent multi-row searches based on a “range” criteria (Ex. >, < )
ORDER BY on index key
First column of multi-column index searches
“I don’t know” or “I don’t want to maintain buckets” – Default for an index
You can always put pkey on hash and range on other columns for seeks
What is different about them?
Contains a tree of pages but not a fixed size like SQL btrees
Pointers are logical IDs that refer to Page Mapping Table
Updates are contained in “delta” records instead of modifying page
Range Indexes A Bw-Tree. “Lock
and Latch” free
btree index
The SCHEMA DLL
• The HK Engine doesn’t understand row formats or key value comparisons
• So when a table is built or altered, we create, compile, and build a DLL
bound to the table used for interop queries
dbid
object_id
Symbol file
Rebuilt when
database comes
online or table altered
cl.exe
output
Generated C
source
t = schema DLL
p =native proc
The Row Payload create table letsgomavs
(col1 int not null identity primary key
nonclustered hash with (bucket_count =
1000000),
col2 int not null,
col3 varchar(100)
)
with (memory_optimized = on, durability
= schema_and_data)
struct NullBitsStruct_2147483654
{
unsigned char hkc_isnull_3:1;
};
struct hkt_2147483654
{
long hkc_1;
long hkc_2;
struct NullBitsStruct_2147483654 null_bits;
unsigned short hkvdo[2];
};
Header hkc_1 hkc_2
Null
bits
Offset
array
col3
“your data”
nullable
Access data with the SCHEMA DLL
Interop Code to “crack” rows
Find hash index in
metadata
Hash key and find
bucket
Call
CompareSKeyToRow
routine
If row found, “crack”
row
Lookup a row in a hash index seek
schema DLL
Find Table
metadata in
Root Table
Get
OpaqueData
pointer
Get
HkOffsetInfo
pointer
Find columns
and offsets
Use this to
“crack” row
values
Cracking
In-Memory
OLTP Data
Rows
Demo
Concurrency,
Transactions,
and Logging
Multi-Version Optimistic Concurrency (MVCC)
• Rows never change: UPDATE = DELETE + INSERT
Immutable
• UPDATE and DELETE create versions
• Timestamps in rows for visibility and transactions
correctness
Versions
• Assume no conflicts
• Snapshots + conflict detection
• Guarantee correct transaction isolation at commit
Optimistic
NOT in
tempdb
Pessimistic
= locks
Errors may occur
In-memory OLTP versioning
In-Mem TableDisk based table
1, ‘Row 1’
100
101
102
103
104
105
106
Time T1 T2
BEGIN TRAN
BEGIN TRAN
2, ‘Row 2’
SELECT
COMMIT
SELECT
COMMIT
100
101
102
103
104
105
106
Time T1 T2
BEGIN TRAN
BEGIN TRAN
2, ‘Row 2’
SELECT
COMMIT
SELECT
COMMIT
read committed and
SERIALIZABLE = blocked
XRCSI = Row 1
Snapshot = Row 1
RCSI = Row 1 and 2
Snapshot = Row 1
SERIALIZABLE =
Msg 41325
Snapshot always
used = Row 1
X
What were those timestamps for?
TS = 243
SELECT Name,
City FROM T1
“write set” = logged records
“read
set”
BEGIN TRAN TX3 – TS 246
SELECT City FROM T1 WITH
(REPEATABLEREAD) WHERE Name =
'Jane';
UPDATE T1 WITH (REPEATABLEREAD)
SET City ‘Helinski’ WHERE Name =
'Susan';
COMMIT TRAN -- commits at timestamp
255
Greg, Lisbon
Susan Bogata
Jane Helsinki
FAIL = ‘Jane’ changed after I started
but before I committed and I’m
REPEATABLEREAD. With SQL update to
Jane would have been blocked
Commit dependency
Transactions Phases and In-Memory OLTP
Update
conflict can
happen here
COMMIT
started
Ensure
isolation levels
are accurate.
Could result in
failure
Wait for
dependencies, flush
log records, update
timestamps
COMMIT
completed
Lookout for
Fkeys
dm_db_xtp_transactions
It is really “lock free”
Code executing transactions in the Hekaton Kernel are lock, latch, and spinlock free
Locks
• Only SCH-S needed for interop queries
• Database locks
Latches
• No pages = No page latches
• Latch XEvents don’t fire inside HK
Spinlocks
• Spinlock class never used
The “host” may wait – tLog, SQLOS
We use Thread
Local Storage (TLS)
to “guard” pointers
We can “walk” lists
lock free. Retries
may be needed
Atomic “Compare
and Swap” (CAS) to
modify
IsXTPSupported() = cmpxchg16b
Spinlocks use CAS
(‘cmpxchg’) to
“acquire” and
“release”
Great blog
post here
*XTP* wait
types not in
transaction
code
deadlock free
CMED_HASH_SET fix
In-Memory OLTP
Concurrency
and
Transactions
Demo
Logging for a simple disk based INSERT
use dontgobluejays
create table starsoftheteam (player_no int primary key nonclustered, player_name
char(100) not null)
One row already exists
begin tran
insert into starsoftheteam (1, 'I really don‘’t like the Blue Jays')
commit tran
LOP_BEGIN_XACT
LOP_INSERT_ROWS – heap page
LOP_INSERT_ROWS – ncl index
LOP_COMMIT_XACT
124
204
116
84
4 log
records
528 bytes
Logging for a simple in-mem OLTP INSERT
use gorangers
create table starsoftheteam (player_no int primary key nonclustered,
player_name char(100) not null)
with (memory_optimized = on, durability = schema_and_data)_
begin tran
insert into starsoftheteam values (1, 'Let'‘s go Rangers')
commit tran
3 log
records
436 bytes
LOP_BEGIN_XACT
LOP_HK
LOP_COMMIT_XACT
144
208
84
HK_LOP_BEGIN_TX
HK_LOP_INSERT_ROW
HK_LOP_COMMIT_TX
fn_dblog_xtp()
Multi-row INSERT disk-based heap table
LOP_BEGIN_XACT
LOP_INSERT_ROWS – heap page
LOP_INSERT_ROWS – ncl index
LOP_BEGIN_XACT
LOP_MODIFY_ROW – PFS
LOP_HOBT_DELTA
LOP_FORMAT_PAGE
LOP_COMMIT_XACT
LOP_SET_FREE_SPACE - PFS
LOP_COMMIT_XACT
100
rows
Log flush
PFS latch multiple times
213 log records @ 33Kb
PFS latch
Metadata access
1 pair for every row
Alloc new page twice
Compared to In-Memory OLTP
LOP_BEGIN_XACT
LOP_HK
LOP_COMMIT_XACT
144
11988
84
HK_LOP_BEGIN_TX
HK_LOP_INSERT_ROW
HK_LOP_COMMIT_TX
100
Only inserted into log cache at commit.
Log flush ROLLBACK =
more log records
for SQL. 0 for HK
If LOP_HK too big we
may need more than
one
Checkpoint
and
Recovery
In-Memory OLTP CHECKPOINT facts
Why CHECKPOINT?
Speed up database startup. Otherwise we would have to keep around a huge tlog around.
We only write data based on committed log records
Independent of SQL recovery intervals or background tasks (1.5Gb log increase)
All data written in pairs of data and delta files
Preemptive tasks using WriteFile (async with FILE_FLAG_NO_BUFFERING)
Data is always appended
I/O is continuous but CHECKPOINT event is based on 1.5Gb log growth
Instant File Initialization matters for PRECREATED files
Data = INSERTs and UPDATEs
Delta = filter for what rows in Data are actually deleted
Periodically we will MERGE several Data/Delta pairs.
Typically 128Mb
but can be 1Gb
Typically 8Mb but
can be 128Mb
Log truncation
eligible at
CHECKPOINT
event
No WAL protocol
CHECKPOINT FILE types and states PRECREATED
ACTIVE
UNDER CONSTRUCTION
WAITING..TRUNCATION
ROOT
FREE
After first CREATE TABLE
DELTA
DATA
ROOT
FREE
INSERT data rows
DELTA
DATA
ROOT
ROOT
CHECKPOINT event
DELTA
DATA
0 to 100 TS
0 to 100 TS
ROOT
ROOT
DELTA
DATA
INSERT more
data rows
DATA
DELTA
101 to 200 TS
101 to 200 TS
These will be
used to populate
table on startup
and then apply
log
Checkpoint
File Pair
(CFP) and
are what is
in tlog
This can be
reused or
deleted at log
truncation
We constantly keep
PRECREATED, FREE
files available
Any tran in this
range
FREEFREE
CHECKPOINT DATA file format
Block
Header
File Header
Block
Header
Data Rows
……
Block
Header
Data Rows
Fixed
“CKPT”
TYPE
4Kb
Length dependent
on TYPE
Row headers and
payload
256Kb
P
A
D
Every type starts with a
FILE Header block of 4Kb
ROOT and DELTA files use
4Kb blocks
Fixed
Find the list @
dm_db_xtp_checkpoint_files
In-Memory OLTP Recovery
Read details from
database Boot Page
(DBINFO)
Load ROOT file for
system tables and
file metadata
Load ACTIVE DATA
files filtered by
DELTA streamed in
parallel
Redo COMMITED
transactions greater
than last
CHECKPOINT LSN
Checksum failures =
RECOVERY_PENDING on
startup
We can
create a
thread per
CPU for this
We are a bit
ERRORLOG “happy”
CHECKPOINT
and
RECOVERY
Demo
Natively
Compiled
Procedures
Native = C code vs T-SQL
Built at CREATE PROC time or recompile
Rebuilt when database comes online
‘XTP Native DLL’ in dm_os_loaded_modules
The structure of a natively compiled proc
create procedure cowboys_proc_scan
with native_compilation, schemabinding as
begin atomic with
(transaction isolation level = snapshot, language = N'English')
select player_number, player_name from dbo.starsoftheteam
..
end
Compile this
Into a DLL
Required. No referenced
object/column
can be dropped or altered.
No SCH lock required
Everything in
block a single
tran
These are required. There are other options
No “*” allowed
Fully qualified
names
Bigger surface
area in SQL
Server 2016
Restrictions and
Best Practices
are here
Iso
levels
still
use
MVCC
Your queries
The lifecycle of compilation
Mixed Abstract
Tree (MAT) built
• Abstraction with flow
and SQL specific info
• Query trees into
MAT nodes
• XML file in the DLL
dir
Converted to
Pure Imperative
Tree (PIT)
• General nodes of a
tree that are SQL
agnostic. C like data
structures
• Easier to turn into
final C code
Final C code built
and written
• C instead of C++ is
simpler and faster
compile.
• No user defined
strings or SQL
identifiers to prevent
injection
Call cl.exe to
compile, link, and
generate DLL
• Many of what is
needed to execute in
the DLL
• Some calls into the
HK Engine
All files in
BINNXTP
VC has
compiler files,
DLLs and libs
Gen has HK
header and
libs
Call cl.exe to
compile and
link
Done in memory
xtp_matgen XEvent
Debugging a Natively
Compiled Proc
Demo
Resource
Management
Inside the Memory of In-Memory OLTP
Hekaton implements its own memory management system built on SQLOS
• MEMORYCLERK_XTP (DB_ID_<dbid>) uses SQLOS Page allocator
• Variable heaps created per table and range index
• Hash indexes using partitioned memory objects for buckets
• “System” memory for database independent tasks
• Memory only limited by the OS (24TB in Windows Server 2016)
• Details in dm_db_xtp_memory_consumers and dm_xtp_system_memory_consumers
In-Memory does recognize SQL Server Memory Pressure
• Garbage collection is triggered
• If OOM, no inserts allowed but you may be able to DELETE to free up space
Allocated at
create index
time
Locked and
Large apply
Resource Governor and In-Memory OLTP
Remember this is ALL memory
SQL Server limits HK to ~70-90% of target depending on size
Freeing up memory = drop db (immediate) or delete data/drop table (deferred)
Binding to your own Resource Pool
Best way to control and monitor memory usage of HK tables
sp_xtp_bind_db_resource_pool required to bind db to RG pool
What about CPU and I/O?
RG I/O does not apply because we don’t use SQLOS for checkpoint I/O
RG CPU does apply for user tasks because we run under SQLOS host
no classifier
function
Not the same
as memory
Let’s Wrap It Up
Walk away with this
 Want 30x faster for OLTP? Go In-Memory OLTP
 Transactions are truly lock, latch, and spinlock free
 Hash index for single row; Range for multi-row
 Reduce transaction latency for super speed
 SQL Server 2016 major step up from 2014
We are not done
• Move to GA for Azure SQL Database
• Increase surface area of features support for SQL Server
• Push the limits of hardware
• Speeding up recovery and HA with NVDIMM and RDMA
• Explore further areas of code optimization
• SQL 2016 CU2 fix for session state workloads Read more here
• Range index performance enhancements
Resources
• SQL Server In-Memory OLTP Internals for SQL Server 2016
• In-Memory OLTP Videos: What it is and When/How to use it
• Explore In-Memory OLTP architectures and customer case studies
• Review In-Memory OLTP in SQL Server 2016 and Azure SQL Database
• In-Memory OLTP (In-Memory Optimization) docs
Bonus
Material
Troubleshooting
Validation errors. Check out this blog
Upgrade from 2014 to 2016 can take time
Large checkpoint files for 2016 could take up more space and increase recovery times
Log growths, XTP_CHECKPOINT waits. Hotfix 6051103
Checkpoint file shut down and with no detailed info: https://support.microsoft.com/en-us/kb/3090141
Unable to rebuild log 6424109 (by design)
Set filegroup offline. NEVER do it because you can’t set it online again (filestream limitation)
You can’t remove HK filegroup after you add it 2016 LOB can cause memory issues
Other CSS Observations
• Hash index doesn’t have a concept of “covered index”
• Low memory can cause recovery to fail (because we need everything in memory)
• No dbcc checkdb support (but checkpoint files and backup have checksum)
From the CSS team
Architecture Pillars
60
The Challenge
61
Client Access
· Backup/
Restore
· DBCC
· Buik load
· Exception
handling
· Event logging
· XEvents
SOS
· Process model (scheduling & synchronization)
· Memory management & caching
· I/O
Txn’s
Lock
mgr
Buffer
Pool
Access Methods
Query Execution & Expression
Services
UCS
Query
Optimization
TSQL Parsing & Algebrizer
Metadata
File
Manager
Database
Manager
Procedure
Cache
· TDS handler
· Session Mgmt
Security
Support Utilities
UDTs &
CLR
Stored
Procs
SNI
Service
Broker
Logging
&
Recovery
10%10%
Network,
TDS
T-SQL
interpreter
Query
Execution
Expression
s
Access
Methods
Transactio
n, Lock,
Log
Managers
SOS, OS
and I/O
35%45%
SQLOS task and worker threads are the foundation
“User” tasks to run transactions
Hidden schedulers used for critical background tasks
Some tasks dedicated while others use a “worker” pool
The In-Memory OLTP Thread Model
62
SQLOS workers dedicated
to HK
NON-PREEMPTIVEPOOL
Background
workers needed
for on-demand
Parallel ALTER
TABLE
WORKERSPOOL
Other workers
needed for on-
demand (non-
preemptive)
Parallel MERGE
operation of
checkpoint files
PREEMPTIVEPOOL
Background
workers needed
on-demand -
preemptive
Serializers to
write out
checkpoint files
In-Memory OLTP Thread Pool
63
Command = XTP_THREAD_POOL wait_type =
DISPATCHER_QUEUE_SEMAPHORE
Command = UNKNOWN
TOKEN wait_type =
XTP_PREEMPTIVE_TASK
Ideal count = < # schedulers> ; idle timeout = 10 secs
HUU
H
U
hidden scheduler
user scheduler
Data Modifications
• Multi-version Optimistic Concurrency prevents all blocking
• ALL UPDATEs are DELETE followed by INSERT
• DELETED rows not automatically removed from memory
• Deleted rows not visible to active transactions becomes stale
• Garbage Collection process removes stale rows from memory
• TRUNCATE TABLE not supported
Page
deallocation in
SQL Server
Garbage Collection
Stale rows to be removed queued and removed by multiple workers
User workers can clean up stale rows as part of normal transactions
Dedicated threads (GC) per NODE to cleanup stale rows (awoken based on # deletes)
SQL Server Memory pressure can accelerate
Dusty corner scan cleanup – ones no transaction ever reaches
This is the only way to reduce size of HK tables
Long running transactions may prevent removal of deleted rows (visible != stale)
Diagnostics
rows_expired = rows that are stale
rows_expired_removed = stale rows removed from memory
Perfmon -SQL Server 2016 XTP Garbage Collection
dm_db_xtp_gc_cycle_stats
dm_xtp_gc_queue_stats
Diagnostics for Native Procs
Query Plans and Stats
SHOWPLAN_XML is only option
Operator list can be found here
sp_xtp_control_query_exec_stats and sp_xtp_control_proc_exec_stats
Query Store
Plan store by default
Runtime stats store needs above stored procedures to enable (need recompile)
XEvent and SQLTrace
sp_statement_completed works for Xevent but more limited information
SQLTrace only picks up overall proc execution at batch level
Read the
docs
Undo, tempdb, and BULK
• Undo required 415 log records @46Kb
• Hekaton required no logging
• Tempdb has similar # log records as disk-
based table but less per log record (aka
minimal logging).
Multi-row
insert test
Remember
SCHEMA_ONLY has no
logging or I/OBULK INSERT for
in-mem executes
and logged just
like INSERT
Minimally logged
BULK INSERT took
271 log records @
27Kb
Latches required for
GAM, PFS, and system
table pates
In-Memory OLTP pushes the log to the limit
Multiple Log Writers in SQL Server 2016
One per NODE up to four
We were able to increase log throughput from 600 to 900Mb/sec
You could go to delayed durability
Database is always consistent
You could lose transactions
Log at the speed of memory
Tail of the log is a “memcpy” to commit
Watch the video
The Merge Process
Over time we could have many CFPs
Why not consolidate data and delta files into a smaller number of pairs?
MERGE TARGET type is target of the merge of files. Becomes ACTIVE
ACTIVE files that were source of merge become WAITING FOR LOG TRUNCATION
Bob Ward
Principal Architect, Microsoft
Bob Ward is a Principal Architect for the Microsoft Data
Group (Tiger Team). Bob has worked for Microsoft for 23
years supporting and speaking on every version of SQL
Server shipped from OS/2 1.1 to SQL Server 2016. He has
worked in customer support as a principal escalation
engineer and Chief Technology Officer (CTO) interacting with
some of the largest SQL Server deployments in the world.
Bob is a well-known speaker on SQL Server often presenting
talks on internals and troubleshooting at events such as SQL
PASS Summit, SQLBits, SQLIntersection, and Microsoft Ignite.
You can find him on twitter at @bobwardms or read his blog
at https://blogs.msdn.microsoft.com/bobsql..
Place your
photo here
/bobwardms @bobwardms Bob Ward
Explore Everything PASS Has to Offer
FREE ONLINE WEBINAR EVENTS FREE 1-DAY LOCAL TRAINING EVENTS
LOCAL USER GROUPS
AROUND THE WORLD
ONLINE SPECIAL INTEREST
USER GROUPS
BUSINESS ANALYTICS TRAINING
VOLUNTEERING OPPORTUNITIES
PASS COMMUNITY NEWSLETTER
BA INSIGHTS NEWSLETTERFREE ONLINE RESOURCES
Session Evaluations
ways to access
Go to passSummit.com Download the GuideBook App
and search: PASS Summit 2016
Follow the QR code link displayed
on session signage throughout the
conference venue and in the
program guide
Submit by 5pm
Friday November 6th to
WIN prizes
Your feedback is
important and valuable. 3
Thank You
Learn more from
Bob Ward
bobward@company.com or follow @bobwardms

More Related Content

What's hot

Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Julian Hyde
 
ORC: 2015 Faster, Better, Smaller
ORC: 2015 Faster, Better, SmallerORC: 2015 Faster, Better, Smaller
ORC: 2015 Faster, Better, Smaller
DataWorks Summit
 
AWR and ASH Deep Dive
AWR and ASH Deep DiveAWR and ASH Deep Dive
AWR and ASH Deep Dive
Kellyn Pot'Vin-Gorman
 
Fast federated SQL with Apache Calcite
Fast federated SQL with Apache CalciteFast federated SQL with Apache Calcite
Fast federated SQL with Apache Calcite
Chris Baynes
 
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
Michael Stack
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Cloudera, Inc.
 
Performance Tuning With Oracle ASH and AWR. Part 1 How And What
Performance Tuning With Oracle ASH and AWR. Part 1 How And WhatPerformance Tuning With Oracle ASH and AWR. Part 1 How And What
Performance Tuning With Oracle ASH and AWR. Part 1 How And Whatudaymoogala
 
Best Practices for the Most Impactful Oracle Database 18c and 19c Features
Best Practices for the Most Impactful Oracle Database 18c and 19c FeaturesBest Practices for the Most Impactful Oracle Database 18c and 19c Features
Best Practices for the Most Impactful Oracle Database 18c and 19c Features
Markus Michalewicz
 
Oracle db performance tuning
Oracle db performance tuningOracle db performance tuning
Oracle db performance tuningSimon Huang
 
Analyzing and Interpreting AWR
Analyzing and Interpreting AWRAnalyzing and Interpreting AWR
Analyzing and Interpreting AWR
pasalapudi
 
PLPgSqL- Datatypes, Language structure.pptx
PLPgSqL- Datatypes, Language structure.pptxPLPgSqL- Datatypes, Language structure.pptx
PLPgSqL- Datatypes, Language structure.pptx
johnwick814916
 
Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!
Julian Hyde
 
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan ZhangExperiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
Databricks
 
Apache Calcite (a tutorial given at BOSS '21)
Apache Calcite (a tutorial given at BOSS '21)Apache Calcite (a tutorial given at BOSS '21)
Apache Calcite (a tutorial given at BOSS '21)
Julian Hyde
 
Building robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumBuilding robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and Debezium
Tathastu.ai
 
All of the Performance Tuning Features in Oracle SQL Developer
All of the Performance Tuning Features in Oracle SQL DeveloperAll of the Performance Tuning Features in Oracle SQL Developer
All of the Performance Tuning Features in Oracle SQL Developer
Jeff Smith
 
Troubleshooting Complex Performance issues - Oracle SEG$ contention
Troubleshooting Complex Performance issues - Oracle SEG$ contentionTroubleshooting Complex Performance issues - Oracle SEG$ contention
Troubleshooting Complex Performance issues - Oracle SEG$ contention
Tanel Poder
 
Cost-based Query Optimization in Apache Phoenix using Apache Calcite
Cost-based Query Optimization in Apache Phoenix using Apache CalciteCost-based Query Optimization in Apache Phoenix using Apache Calcite
Cost-based Query Optimization in Apache Phoenix using Apache Calcite
Julian Hyde
 
Developer’s guide to contributing code to Kafka with Mickael Maison and Tom B...
Developer’s guide to contributing code to Kafka with Mickael Maison and Tom B...Developer’s guide to contributing code to Kafka with Mickael Maison and Tom B...
Developer’s guide to contributing code to Kafka with Mickael Maison and Tom B...
HostedbyConfluent
 

What's hot (20)

Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
 
ORC: 2015 Faster, Better, Smaller
ORC: 2015 Faster, Better, SmallerORC: 2015 Faster, Better, Smaller
ORC: 2015 Faster, Better, Smaller
 
AWR and ASH Deep Dive
AWR and ASH Deep DiveAWR and ASH Deep Dive
AWR and ASH Deep Dive
 
Fast federated SQL with Apache Calcite
Fast federated SQL with Apache CalciteFast federated SQL with Apache Calcite
Fast federated SQL with Apache Calcite
 
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
 
Using Statspack and AWR for Memory Monitoring and Tuning
Using Statspack and AWR for Memory Monitoring and TuningUsing Statspack and AWR for Memory Monitoring and Tuning
Using Statspack and AWR for Memory Monitoring and Tuning
 
Performance Tuning With Oracle ASH and AWR. Part 1 How And What
Performance Tuning With Oracle ASH and AWR. Part 1 How And WhatPerformance Tuning With Oracle ASH and AWR. Part 1 How And What
Performance Tuning With Oracle ASH and AWR. Part 1 How And What
 
Best Practices for the Most Impactful Oracle Database 18c and 19c Features
Best Practices for the Most Impactful Oracle Database 18c and 19c FeaturesBest Practices for the Most Impactful Oracle Database 18c and 19c Features
Best Practices for the Most Impactful Oracle Database 18c and 19c Features
 
Oracle db performance tuning
Oracle db performance tuningOracle db performance tuning
Oracle db performance tuning
 
Analyzing and Interpreting AWR
Analyzing and Interpreting AWRAnalyzing and Interpreting AWR
Analyzing and Interpreting AWR
 
PLPgSqL- Datatypes, Language structure.pptx
PLPgSqL- Datatypes, Language structure.pptxPLPgSqL- Datatypes, Language structure.pptx
PLPgSqL- Datatypes, Language structure.pptx
 
Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!
 
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan ZhangExperiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
 
Apache Calcite (a tutorial given at BOSS '21)
Apache Calcite (a tutorial given at BOSS '21)Apache Calcite (a tutorial given at BOSS '21)
Apache Calcite (a tutorial given at BOSS '21)
 
Building robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumBuilding robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and Debezium
 
All of the Performance Tuning Features in Oracle SQL Developer
All of the Performance Tuning Features in Oracle SQL DeveloperAll of the Performance Tuning Features in Oracle SQL Developer
All of the Performance Tuning Features in Oracle SQL Developer
 
Troubleshooting Complex Performance issues - Oracle SEG$ contention
Troubleshooting Complex Performance issues - Oracle SEG$ contentionTroubleshooting Complex Performance issues - Oracle SEG$ contention
Troubleshooting Complex Performance issues - Oracle SEG$ contention
 
Cost-based Query Optimization in Apache Phoenix using Apache Calcite
Cost-based Query Optimization in Apache Phoenix using Apache CalciteCost-based Query Optimization in Apache Phoenix using Apache Calcite
Cost-based Query Optimization in Apache Phoenix using Apache Calcite
 
Developer’s guide to contributing code to Kafka with Mickael Maison and Tom B...
Developer’s guide to contributing code to Kafka with Mickael Maison and Tom B...Developer’s guide to contributing code to Kafka with Mickael Maison and Tom B...
Developer’s guide to contributing code to Kafka with Mickael Maison and Tom B...
 

Viewers also liked

SQL Server In-Memory OLTP: What Every SQL Professional Should Know
SQL Server In-Memory OLTP: What Every SQL Professional Should KnowSQL Server In-Memory OLTP: What Every SQL Professional Should Know
SQL Server In-Memory OLTP: What Every SQL Professional Should Know
Bob Ward
 
SQL Server R Services: What Every SQL Professional Should Know
SQL Server R Services: What Every SQL Professional Should KnowSQL Server R Services: What Every SQL Professional Should Know
SQL Server R Services: What Every SQL Professional Should Know
Bob Ward
 
Brk3288 sql server v.next with support on linux, windows and containers was...
Brk3288 sql server v.next with support on linux, windows and containers   was...Brk3288 sql server v.next with support on linux, windows and containers   was...
Brk3288 sql server v.next with support on linux, windows and containers was...
Bob Ward
 
SQL Server It Just Runs Faster
SQL Server It Just Runs FasterSQL Server It Just Runs Faster
SQL Server It Just Runs Faster
Bob Ward
 
Brk3043 azure sql db intelligent cloud database for app developers - wash dc
Brk3043 azure sql db   intelligent cloud database for app developers - wash dcBrk3043 azure sql db   intelligent cloud database for app developers - wash dc
Brk3043 azure sql db intelligent cloud database for app developers - wash dc
Bob Ward
 
Gs08 modernize your data platform with sql technologies wash dc
Gs08 modernize your data platform with sql technologies   wash dcGs08 modernize your data platform with sql technologies   wash dc
Gs08 modernize your data platform with sql technologies wash dc
Bob Ward
 
Twenty is Plenty
Twenty is PlentyTwenty is Plenty
Twenty is Plenty
Bob Ward
 
Accounting concepts
Accounting conceptsAccounting concepts
Accounting concepts
Mohamed Etman
 
Actividad 2
Actividad 2Actividad 2
Actividad 5 (1)
Actividad 5 (1)Actividad 5 (1)
Actividad 5 (1)
Mary Tere Villanueva
 
indian standard for Hard-Drawn Copper Conductors for Over Head Power Transmis...
indian standard for Hard-Drawn Copper Conductors for Over Head Power Transmis...indian standard for Hard-Drawn Copper Conductors for Over Head Power Transmis...
indian standard for Hard-Drawn Copper Conductors for Over Head Power Transmis...
Chirag vasava
 
3. linux installation
3.  linux installation3.  linux installation
3. linux installation
kris harden
 
Privacidad, seguridad y protección online
Privacidad, seguridad y protección onlinePrivacidad, seguridad y protección online
Privacidad, seguridad y protección online
Patricia Antenucci
 
Cast/Actors
Cast/ActorsCast/Actors
Cast/Actors
AshleyC__
 
Ppt on self study
Ppt on self studyPpt on self study
Ppt on self study
Dr.Sanjeev Kumar
 
Agile vs. waterfall
Agile vs. waterfallAgile vs. waterfall
Agile vs. waterfall
Dvir Zohar
 
Otimizando a performance com in memory no sql 2016
Otimizando a performance com in memory no sql 2016Otimizando a performance com in memory no sql 2016
Otimizando a performance com in memory no sql 2016
Luiz Henrique Garetti Rosário
 
Copper conductor
Copper conductorCopper conductor
Copper conductor
Chirag vasava
 
Who's Reading Over Your Shoulder?
Who's Reading Over Your Shoulder? Who's Reading Over Your Shoulder?
Who's Reading Over Your Shoulder?
Hann9110
 

Viewers also liked (20)

SQL Server In-Memory OLTP: What Every SQL Professional Should Know
SQL Server In-Memory OLTP: What Every SQL Professional Should KnowSQL Server In-Memory OLTP: What Every SQL Professional Should Know
SQL Server In-Memory OLTP: What Every SQL Professional Should Know
 
SQL Server R Services: What Every SQL Professional Should Know
SQL Server R Services: What Every SQL Professional Should KnowSQL Server R Services: What Every SQL Professional Should Know
SQL Server R Services: What Every SQL Professional Should Know
 
Brk3288 sql server v.next with support on linux, windows and containers was...
Brk3288 sql server v.next with support on linux, windows and containers   was...Brk3288 sql server v.next with support on linux, windows and containers   was...
Brk3288 sql server v.next with support on linux, windows and containers was...
 
SQL Server It Just Runs Faster
SQL Server It Just Runs FasterSQL Server It Just Runs Faster
SQL Server It Just Runs Faster
 
Brk3043 azure sql db intelligent cloud database for app developers - wash dc
Brk3043 azure sql db   intelligent cloud database for app developers - wash dcBrk3043 azure sql db   intelligent cloud database for app developers - wash dc
Brk3043 azure sql db intelligent cloud database for app developers - wash dc
 
Gs08 modernize your data platform with sql technologies wash dc
Gs08 modernize your data platform with sql technologies   wash dcGs08 modernize your data platform with sql technologies   wash dc
Gs08 modernize your data platform with sql technologies wash dc
 
Twenty is Plenty
Twenty is PlentyTwenty is Plenty
Twenty is Plenty
 
ckitterman resume
ckitterman resumeckitterman resume
ckitterman resume
 
Accounting concepts
Accounting conceptsAccounting concepts
Accounting concepts
 
Actividad 2
Actividad 2Actividad 2
Actividad 2
 
Actividad 5 (1)
Actividad 5 (1)Actividad 5 (1)
Actividad 5 (1)
 
indian standard for Hard-Drawn Copper Conductors for Over Head Power Transmis...
indian standard for Hard-Drawn Copper Conductors for Over Head Power Transmis...indian standard for Hard-Drawn Copper Conductors for Over Head Power Transmis...
indian standard for Hard-Drawn Copper Conductors for Over Head Power Transmis...
 
3. linux installation
3.  linux installation3.  linux installation
3. linux installation
 
Privacidad, seguridad y protección online
Privacidad, seguridad y protección onlinePrivacidad, seguridad y protección online
Privacidad, seguridad y protección online
 
Cast/Actors
Cast/ActorsCast/Actors
Cast/Actors
 
Ppt on self study
Ppt on self studyPpt on self study
Ppt on self study
 
Agile vs. waterfall
Agile vs. waterfallAgile vs. waterfall
Agile vs. waterfall
 
Otimizando a performance com in memory no sql 2016
Otimizando a performance com in memory no sql 2016Otimizando a performance com in memory no sql 2016
Otimizando a performance com in memory no sql 2016
 
Copper conductor
Copper conductorCopper conductor
Copper conductor
 
Who's Reading Over Your Shoulder?
Who's Reading Over Your Shoulder? Who's Reading Over Your Shoulder?
Who's Reading Over Your Shoulder?
 

Similar to Inside SQL Server In-Memory OLTP

Inside sql server in memory oltp sql sat nyc 2017
Inside sql server in memory oltp sql sat nyc 2017Inside sql server in memory oltp sql sat nyc 2017
Inside sql server in memory oltp sql sat nyc 2017
Bob Ward
 
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
NoSQLmatters
 
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Michael Rys
 
Information Retrieval - Data Science Bootcamp
Information Retrieval - Data Science BootcampInformation Retrieval - Data Science Bootcamp
Information Retrieval - Data Science Bootcamp
Kais Hassan, PhD
 
10 Reasons to Start Your Analytics Project with PostgreSQL
10 Reasons to Start Your Analytics Project with PostgreSQL10 Reasons to Start Your Analytics Project with PostgreSQL
10 Reasons to Start Your Analytics Project with PostgreSQL
Satoshi Nagayasu
 
JSSUG: SQL Sever Index Tuning
JSSUG: SQL Sever Index TuningJSSUG: SQL Sever Index Tuning
JSSUG: SQL Sever Index Tuning
Kenichiro Nakamura
 
Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09
guest9d79e073
 
Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09
Mark Ginnebaugh
 
Optimizing Application Architecture (.NET/Java topics)
Optimizing Application Architecture (.NET/Java topics)Optimizing Application Architecture (.NET/Java topics)
Optimizing Application Architecture (.NET/Java topics)Ravi Okade
 
Perl and Elasticsearch
Perl and ElasticsearchPerl and Elasticsearch
Perl and Elasticsearch
Dean Hamstead
 
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Michael Rys
 
Novedades SQL Server 2014
Novedades SQL Server 2014Novedades SQL Server 2014
Novedades SQL Server 2014
netmind
 
2021 04-20 apache arrow and its impact on the database industry.pptx
2021 04-20  apache arrow and its impact on the database industry.pptx2021 04-20  apache arrow and its impact on the database industry.pptx
2021 04-20 apache arrow and its impact on the database industry.pptx
Andrew Lamb
 
NOSQL and Cassandra
NOSQL and CassandraNOSQL and Cassandra
NOSQL and Cassandrarantav
 
When to no sql and when to know sql javaone
When to no sql and when to know sql   javaoneWhen to no sql and when to know sql   javaone
When to no sql and when to know sql javaone
Simon Elliston Ball
 
Database Performance
Database PerformanceDatabase Performance
Database Performance
Boris Hristov
 
Performance By Design
Performance By DesignPerformance By Design
Performance By Design
Guy Harrison
 
Applyinga blockcentricapproach
Applyinga blockcentricapproachApplyinga blockcentricapproach
Applyinga blockcentricapproach
oracle documents
 
Taming the Data Science Monster with A New ‘Sword’ – U-SQL
Taming the Data Science Monster with A New ‘Sword’ – U-SQLTaming the Data Science Monster with A New ‘Sword’ – U-SQL
Taming the Data Science Monster with A New ‘Sword’ – U-SQL
Michael Rys
 
NoSQL Endgame DevoxxUA Conference 2020
NoSQL Endgame DevoxxUA Conference 2020NoSQL Endgame DevoxxUA Conference 2020
NoSQL Endgame DevoxxUA Conference 2020
Thodoris Bais
 

Similar to Inside SQL Server In-Memory OLTP (20)

Inside sql server in memory oltp sql sat nyc 2017
Inside sql server in memory oltp sql sat nyc 2017Inside sql server in memory oltp sql sat nyc 2017
Inside sql server in memory oltp sql sat nyc 2017
 
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
 
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
 
Information Retrieval - Data Science Bootcamp
Information Retrieval - Data Science BootcampInformation Retrieval - Data Science Bootcamp
Information Retrieval - Data Science Bootcamp
 
10 Reasons to Start Your Analytics Project with PostgreSQL
10 Reasons to Start Your Analytics Project with PostgreSQL10 Reasons to Start Your Analytics Project with PostgreSQL
10 Reasons to Start Your Analytics Project with PostgreSQL
 
JSSUG: SQL Sever Index Tuning
JSSUG: SQL Sever Index TuningJSSUG: SQL Sever Index Tuning
JSSUG: SQL Sever Index Tuning
 
Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09
 
Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09
 
Optimizing Application Architecture (.NET/Java topics)
Optimizing Application Architecture (.NET/Java topics)Optimizing Application Architecture (.NET/Java topics)
Optimizing Application Architecture (.NET/Java topics)
 
Perl and Elasticsearch
Perl and ElasticsearchPerl and Elasticsearch
Perl and Elasticsearch
 
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
 
Novedades SQL Server 2014
Novedades SQL Server 2014Novedades SQL Server 2014
Novedades SQL Server 2014
 
2021 04-20 apache arrow and its impact on the database industry.pptx
2021 04-20  apache arrow and its impact on the database industry.pptx2021 04-20  apache arrow and its impact on the database industry.pptx
2021 04-20 apache arrow and its impact on the database industry.pptx
 
NOSQL and Cassandra
NOSQL and CassandraNOSQL and Cassandra
NOSQL and Cassandra
 
When to no sql and when to know sql javaone
When to no sql and when to know sql   javaoneWhen to no sql and when to know sql   javaone
When to no sql and when to know sql javaone
 
Database Performance
Database PerformanceDatabase Performance
Database Performance
 
Performance By Design
Performance By DesignPerformance By Design
Performance By Design
 
Applyinga blockcentricapproach
Applyinga blockcentricapproachApplyinga blockcentricapproach
Applyinga blockcentricapproach
 
Taming the Data Science Monster with A New ‘Sword’ – U-SQL
Taming the Data Science Monster with A New ‘Sword’ – U-SQLTaming the Data Science Monster with A New ‘Sword’ – U-SQL
Taming the Data Science Monster with A New ‘Sword’ – U-SQL
 
NoSQL Endgame DevoxxUA Conference 2020
NoSQL Endgame DevoxxUA Conference 2020NoSQL Endgame DevoxxUA Conference 2020
NoSQL Endgame DevoxxUA Conference 2020
 

More from Bob Ward

Experience sql server on l inux and docker
Experience sql server on l inux and dockerExperience sql server on l inux and docker
Experience sql server on l inux and docker
Bob Ward
 
Build new age applications on azures intelligent data platform
Build new age applications on azures intelligent data platformBuild new age applications on azures intelligent data platform
Build new age applications on azures intelligent data platform
Bob Ward
 
Brk2051 sql server on linux and docker
Brk2051 sql server on linux and dockerBrk2051 sql server on linux and docker
Brk2051 sql server on linux and docker
Bob Ward
 
Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)
Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)
Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)
Bob Ward
 
Experience SQL Server 2017: The Modern Data Platform
Experience SQL Server 2017: The Modern Data PlatformExperience SQL Server 2017: The Modern Data Platform
Experience SQL Server 2017: The Modern Data Platform
Bob Ward
 
Sql server hybrid what every sql professional should know
Sql server hybrid what every sql professional should knowSql server hybrid what every sql professional should know
Sql server hybrid what every sql professional should know
Bob Ward
 
Keep your environment always on with sql server 2016 sql bits 2017
Keep your environment always on with sql server 2016 sql bits 2017Keep your environment always on with sql server 2016 sql bits 2017
Keep your environment always on with sql server 2016 sql bits 2017
Bob Ward
 
Enhancements that will make your sql database roar sp1 edition sql bits 2017
Enhancements that will make your sql database roar sp1 edition sql bits 2017Enhancements that will make your sql database roar sp1 edition sql bits 2017
Enhancements that will make your sql database roar sp1 edition sql bits 2017
Bob Ward
 
Sql server 2016 it just runs faster sql bits 2017 edition
Sql server 2016 it just runs faster   sql bits 2017 editionSql server 2016 it just runs faster   sql bits 2017 edition
Sql server 2016 it just runs faster sql bits 2017 edition
Bob Ward
 

More from Bob Ward (9)

Experience sql server on l inux and docker
Experience sql server on l inux and dockerExperience sql server on l inux and docker
Experience sql server on l inux and docker
 
Build new age applications on azures intelligent data platform
Build new age applications on azures intelligent data platformBuild new age applications on azures intelligent data platform
Build new age applications on azures intelligent data platform
 
Brk2051 sql server on linux and docker
Brk2051 sql server on linux and dockerBrk2051 sql server on linux and docker
Brk2051 sql server on linux and docker
 
Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)
Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)
Brk2045 upgrade sql server 2017 (on prem, iaa-s and paas)
 
Experience SQL Server 2017: The Modern Data Platform
Experience SQL Server 2017: The Modern Data PlatformExperience SQL Server 2017: The Modern Data Platform
Experience SQL Server 2017: The Modern Data Platform
 
Sql server hybrid what every sql professional should know
Sql server hybrid what every sql professional should knowSql server hybrid what every sql professional should know
Sql server hybrid what every sql professional should know
 
Keep your environment always on with sql server 2016 sql bits 2017
Keep your environment always on with sql server 2016 sql bits 2017Keep your environment always on with sql server 2016 sql bits 2017
Keep your environment always on with sql server 2016 sql bits 2017
 
Enhancements that will make your sql database roar sp1 edition sql bits 2017
Enhancements that will make your sql database roar sp1 edition sql bits 2017Enhancements that will make your sql database roar sp1 edition sql bits 2017
Enhancements that will make your sql database roar sp1 edition sql bits 2017
 
Sql server 2016 it just runs faster sql bits 2017 edition
Sql server 2016 it just runs faster   sql bits 2017 editionSql server 2016 it just runs faster   sql bits 2017 edition
Sql server 2016 it just runs faster sql bits 2017 edition
 

Recently uploaded

Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Globus
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
abdulrafaychaudhry
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus
 
De mooiste recreatieve routes ontdekken met RouteYou en FME
De mooiste recreatieve routes ontdekken met RouteYou en FMEDe mooiste recreatieve routes ontdekken met RouteYou en FME
De mooiste recreatieve routes ontdekken met RouteYou en FME
Jelle | Nordend
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
wottaspaceseo
 
Cyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdfCyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdf
Cyanic lab
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
XfilesPro
 
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Anthony Dahanne
 
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
XfilesPro
 
Strategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptxStrategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptx
varshanayak241
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
takuyayamamoto1800
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2
 
Using IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New ZealandUsing IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New Zealand
IES VE
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
Globus
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
Ortus Solutions, Corp
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
Globus
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Globus
 
Advanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowAdvanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should Know
Peter Caitens
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 

Recently uploaded (20)

Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
 
De mooiste recreatieve routes ontdekken met RouteYou en FME
De mooiste recreatieve routes ontdekken met RouteYou en FMEDe mooiste recreatieve routes ontdekken met RouteYou en FME
De mooiste recreatieve routes ontdekken met RouteYou en FME
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
 
Cyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdfCyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdf
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
 
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
 
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
 
Strategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptxStrategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptx
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
Using IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New ZealandUsing IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New Zealand
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
 
Advanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowAdvanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should Know
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 

Inside SQL Server In-Memory OLTP

  • 1. Inside SQL Server In-Memory OLTP Bob Ward, Principal Architect, Microsoft, Data Group, Tiger Team Latch free since 2007 Based on this whitepaper, internal docs, source code, and reverse engineering Credits to Kalen Delaney, Jos de Bruijn, Sunil Agarwal, Cristian Diaconu, Craig Freedman, Alejandro Hernandez Saenz, Jack Li, Mike Zwilling and the entire Hekaton team deck and demos at http://aka.ms/bobwardms
  • 2.
  • 3. This is a 500 Level Presentation It is ONE talk with a 15min break for 3 hours = Great primer and customer stories by @jdebruijn Get and run the demo yourself from github
  • 4. Let’s Do This What, How, Why, and Architecture Data and Indexes Concurrency, Transactions, and Logging Checkpoint and Recovery Natively Compiled Procedures Resource Management Based on SQL Server 2016 Check out the Bonus Material Section if time
  • 5. The Research behind the concept OLTP Through the Looking Glass, and What We Found There by Stavros Harizopoulos, Daniel J. Abadi, Samuel Madden, Michael Stonebraker , SIGMOD 2008 TPC-C hand-coded on top of the SHORE storage engine
  • 6. The Path to In-Memory OLTP Project Verde • 2007 “Early” Hekaton • 2009/2010 Hekaton becomes In- Memory OLTP • SQL Server 2014 In-Memory OLTP Unleashed • SQL Server 2016
  • 7. Keep SQL Server relevant in a world of high-speed hardware with dense cores, fast I/O, and inexpensive massive memory The need for high-speed OLTP transactions at microsecond speed Reduce the number of instructions to execute a transaction Find areas of latency for a transaction and reduce its footprint Use multi-version optimistic concurrency and “lock free” algorithms Use DLLs and native compiled procs What, Why, and How 7 XTP = eXtreme Transaction Processing HK = Hekaton Minimal diagnostics in critical path
  • 8. • Supported on Always On Availability Groups • Supports ColumnStore Indexes for HTAP applications • Supported in Azure SQL Database • Cross container transactions (disk and in-memory in one transaction) • Table variables supported • SQL surface area expanded in SQL Server 2016. Complete support here • LOB data types (ex. Varchar(max)) supported • BACKUP/RESTORE complete functionality • Use Transaction Performance Analysis Report for migration In-Memory OLTP Capabilities 8 Preview Hear the case studies
  • 9. “Normal” INSERT The Path of In-Memory OLTP Transactions 9 Release latches and locks Update index pages ( more locks and latch ) Modify page Latch page Obtain locks INSERT LOG record In-Memory OLTP INSERT Maintain index in memory Insert ROW into memory COMMIT Transaction = Log Record and Flush COMMIT Transaction = Insert HK Log Record and Flush Page Split Sys Tran = Log Flush Spinlocks No index logging SCHEMA_ONLY no logging No latch No spinlock
  • 12. The In-Memory OLTP Architecture 12
  • 13. Runtime Call Native Procs Query Interop Compile Build Schema DLL Build Native Proc DLL The Engine Allocate Memory Checkpoint Transaction Logging System Tables SQL Engine Creating and executing procs Query Interop SQL System Tables Log Manager HADR In-Memory OLTP Engine Components 13 hkengine.dllhkcompile.dllhkruntime.dll examples below SQLOS (sqldk.dll) SCHEMA DLL and Native Proc DLL Hekaton Kernel
  • 14. filegroup <fg_name> contains memory_optimized_data (name = <name>, filename = ‘<drive and path><folder name>’) Creating the database for in-memory OLTP 14 filestream container we create this • Name = DB_ID_<dbid> • Your data goes here so this gets big MEMORYCLERK_XTP • Two of these for each db • Much smaller but can grow some over time MEMOBJ_XTPDB HK System Tables created (hash index based tables) Create multiple across drives
  • 15. In-memory OLTP Threads 15 USER GC USER USER USER USER USER Node 0 GC Node <n> user transactions Garbage collection one per node database CHECKPOINT CONTROLLER CHECKPOINT CLOSE LOG FLUSH LOG FLUSH thread Data Serializer thread thread In-Memory Thread Pools CHECKPOINT I/O U U P XTP_OFFLINE_CKPT H H P U H P P U U H P user scheduler hidden scheduler preemptive You could see session_id > 50 Data Serializer dm_xtp_threads
  • 17. Data and Indexes Hash Index = single row lookup with ‘=‘ Range Index = searching for multi-rows
  • 18. Creating a Table create table starsoftheteam (player_number int identity primary key nonclustered hash with (bucket_count = 1048576), player_name char(1000) not null) with (memory_optimized = on, durability = schema_and_data) All tables require at least one index Schema stored in SQL system catalog Hash index built in- memory Hk system tables populated Schema DLL built Perform a checkpoint Ex. “root” table FS container files created and written Default = range
  • 19. Inside Data and Rows How do you find data rows in a normal SQL table? • Heap = Use IAM pages • Clustered Index = Find root page and traverse index What about an in-memory table? • Hash index table pointer known in HK metadata for a table. Hash the index key, go to bucket pointer, traverse chain to find row • Page Mapping Table used for range indexes and has a known pointer in HK metadata. Traverse the range index which points to data rows • Data exists in memory as pointers to rows (aka a heap). No page structure All data rows have known header but data is opaque to HK engine • Schema DLL and/or Native Compiled Proc DLL knows the format of the row data • Schema DLL and/or Native Compiled Proc DLL knows how to find “key” inside the index Compute your estimated row size here “bag of bytes” Hash index scan is possible
  • 20. A In-Memory OLTP Data Row Your data. “bytes” to HK engine Points to another row in the “chain”. One pointer for each index on the table Timestamp of INSERT Halloween protection Number of pointers = # indexes Rows don’t have to be contiguous in memory Timestamp of DELETE. ∞ = “not deleted”
  • 21. 100,∞ Bob Dallas Hash indexes hash on name columns = name char(100), city char (100) hash index on name bucket_count = 8 hash index on city bucket_count = 8 hash on city 0 0 4 5 200,∞ Bob Fort Worth 90,∞ Ginger Austin 50,∞ Ryan Arlington 70,∞ Ginger Houston 3 4 5 chain count = 3 Hash collision duplicate key != collision Begin ts reflects order of inserts ts
  • 22. Hash Indexes When should I use these? Single row lookups by exact key value Support multi-column indexes Do I care about bucket counts and chain length? The key is to keep the chains short Key values with high cardinality the best and typically result in short chains Larger number of buckets keep collisions low and chains shorter Monitor and adjust with dm_db_xtp_hash_index_stats SQL catalog view sys.hash_indexes also provided Empty bucket % high ok except uses memory and can affect scans Monitor average and max chain length (1 avg chain ideal; < 10 acceptable) Rebuild with ALTER TABLE. Could take some time REBUILD can block interop with SCH-S; Native proc rebuilt so blocked
  • 24. Range Indexes When should I use these? Frequent multi-row searches based on a “range” criteria (Ex. >, < ) ORDER BY on index key First column of multi-column index searches “I don’t know” or “I don’t want to maintain buckets” – Default for an index You can always put pkey on hash and range on other columns for seeks What is different about them? Contains a tree of pages but not a fixed size like SQL btrees Pointers are logical IDs that refer to Page Mapping Table Updates are contained in “delta” records instead of modifying page
  • 25. Range Indexes A Bw-Tree. “Lock and Latch” free btree index
  • 26. The SCHEMA DLL • The HK Engine doesn’t understand row formats or key value comparisons • So when a table is built or altered, we create, compile, and build a DLL bound to the table used for interop queries dbid object_id Symbol file Rebuilt when database comes online or table altered cl.exe output Generated C source t = schema DLL p =native proc
  • 27. The Row Payload create table letsgomavs (col1 int not null identity primary key nonclustered hash with (bucket_count = 1000000), col2 int not null, col3 varchar(100) ) with (memory_optimized = on, durability = schema_and_data) struct NullBitsStruct_2147483654 { unsigned char hkc_isnull_3:1; }; struct hkt_2147483654 { long hkc_1; long hkc_2; struct NullBitsStruct_2147483654 null_bits; unsigned short hkvdo[2]; }; Header hkc_1 hkc_2 Null bits Offset array col3 “your data” nullable
  • 28. Access data with the SCHEMA DLL Interop Code to “crack” rows Find hash index in metadata Hash key and find bucket Call CompareSKeyToRow routine If row found, “crack” row Lookup a row in a hash index seek schema DLL Find Table metadata in Root Table Get OpaqueData pointer Get HkOffsetInfo pointer Find columns and offsets Use this to “crack” row values
  • 31. Multi-Version Optimistic Concurrency (MVCC) • Rows never change: UPDATE = DELETE + INSERT Immutable • UPDATE and DELETE create versions • Timestamps in rows for visibility and transactions correctness Versions • Assume no conflicts • Snapshots + conflict detection • Guarantee correct transaction isolation at commit Optimistic NOT in tempdb Pessimistic = locks Errors may occur
  • 32. In-memory OLTP versioning In-Mem TableDisk based table 1, ‘Row 1’ 100 101 102 103 104 105 106 Time T1 T2 BEGIN TRAN BEGIN TRAN 2, ‘Row 2’ SELECT COMMIT SELECT COMMIT 100 101 102 103 104 105 106 Time T1 T2 BEGIN TRAN BEGIN TRAN 2, ‘Row 2’ SELECT COMMIT SELECT COMMIT read committed and SERIALIZABLE = blocked XRCSI = Row 1 Snapshot = Row 1 RCSI = Row 1 and 2 Snapshot = Row 1 SERIALIZABLE = Msg 41325 Snapshot always used = Row 1 X
  • 33. What were those timestamps for? TS = 243 SELECT Name, City FROM T1 “write set” = logged records “read set” BEGIN TRAN TX3 – TS 246 SELECT City FROM T1 WITH (REPEATABLEREAD) WHERE Name = 'Jane'; UPDATE T1 WITH (REPEATABLEREAD) SET City ‘Helinski’ WHERE Name = 'Susan'; COMMIT TRAN -- commits at timestamp 255 Greg, Lisbon Susan Bogata Jane Helsinki FAIL = ‘Jane’ changed after I started but before I committed and I’m REPEATABLEREAD. With SQL update to Jane would have been blocked Commit dependency
  • 34. Transactions Phases and In-Memory OLTP Update conflict can happen here COMMIT started Ensure isolation levels are accurate. Could result in failure Wait for dependencies, flush log records, update timestamps COMMIT completed Lookout for Fkeys dm_db_xtp_transactions
  • 35. It is really “lock free” Code executing transactions in the Hekaton Kernel are lock, latch, and spinlock free Locks • Only SCH-S needed for interop queries • Database locks Latches • No pages = No page latches • Latch XEvents don’t fire inside HK Spinlocks • Spinlock class never used The “host” may wait – tLog, SQLOS We use Thread Local Storage (TLS) to “guard” pointers We can “walk” lists lock free. Retries may be needed Atomic “Compare and Swap” (CAS) to modify IsXTPSupported() = cmpxchg16b Spinlocks use CAS (‘cmpxchg’) to “acquire” and “release” Great blog post here *XTP* wait types not in transaction code deadlock free CMED_HASH_SET fix
  • 37. Logging for a simple disk based INSERT use dontgobluejays create table starsoftheteam (player_no int primary key nonclustered, player_name char(100) not null) One row already exists begin tran insert into starsoftheteam (1, 'I really don‘’t like the Blue Jays') commit tran LOP_BEGIN_XACT LOP_INSERT_ROWS – heap page LOP_INSERT_ROWS – ncl index LOP_COMMIT_XACT 124 204 116 84 4 log records 528 bytes
  • 38. Logging for a simple in-mem OLTP INSERT use gorangers create table starsoftheteam (player_no int primary key nonclustered, player_name char(100) not null) with (memory_optimized = on, durability = schema_and_data)_ begin tran insert into starsoftheteam values (1, 'Let'‘s go Rangers') commit tran 3 log records 436 bytes LOP_BEGIN_XACT LOP_HK LOP_COMMIT_XACT 144 208 84 HK_LOP_BEGIN_TX HK_LOP_INSERT_ROW HK_LOP_COMMIT_TX fn_dblog_xtp()
  • 39. Multi-row INSERT disk-based heap table LOP_BEGIN_XACT LOP_INSERT_ROWS – heap page LOP_INSERT_ROWS – ncl index LOP_BEGIN_XACT LOP_MODIFY_ROW – PFS LOP_HOBT_DELTA LOP_FORMAT_PAGE LOP_COMMIT_XACT LOP_SET_FREE_SPACE - PFS LOP_COMMIT_XACT 100 rows Log flush PFS latch multiple times 213 log records @ 33Kb PFS latch Metadata access 1 pair for every row Alloc new page twice
  • 40. Compared to In-Memory OLTP LOP_BEGIN_XACT LOP_HK LOP_COMMIT_XACT 144 11988 84 HK_LOP_BEGIN_TX HK_LOP_INSERT_ROW HK_LOP_COMMIT_TX 100 Only inserted into log cache at commit. Log flush ROLLBACK = more log records for SQL. 0 for HK If LOP_HK too big we may need more than one
  • 42. In-Memory OLTP CHECKPOINT facts Why CHECKPOINT? Speed up database startup. Otherwise we would have to keep around a huge tlog around. We only write data based on committed log records Independent of SQL recovery intervals or background tasks (1.5Gb log increase) All data written in pairs of data and delta files Preemptive tasks using WriteFile (async with FILE_FLAG_NO_BUFFERING) Data is always appended I/O is continuous but CHECKPOINT event is based on 1.5Gb log growth Instant File Initialization matters for PRECREATED files Data = INSERTs and UPDATEs Delta = filter for what rows in Data are actually deleted Periodically we will MERGE several Data/Delta pairs. Typically 128Mb but can be 1Gb Typically 8Mb but can be 128Mb Log truncation eligible at CHECKPOINT event No WAL protocol
  • 43. CHECKPOINT FILE types and states PRECREATED ACTIVE UNDER CONSTRUCTION WAITING..TRUNCATION ROOT FREE After first CREATE TABLE DELTA DATA ROOT FREE INSERT data rows DELTA DATA ROOT ROOT CHECKPOINT event DELTA DATA 0 to 100 TS 0 to 100 TS ROOT ROOT DELTA DATA INSERT more data rows DATA DELTA 101 to 200 TS 101 to 200 TS These will be used to populate table on startup and then apply log Checkpoint File Pair (CFP) and are what is in tlog This can be reused or deleted at log truncation We constantly keep PRECREATED, FREE files available Any tran in this range FREEFREE
  • 44. CHECKPOINT DATA file format Block Header File Header Block Header Data Rows …… Block Header Data Rows Fixed “CKPT” TYPE 4Kb Length dependent on TYPE Row headers and payload 256Kb P A D Every type starts with a FILE Header block of 4Kb ROOT and DELTA files use 4Kb blocks Fixed Find the list @ dm_db_xtp_checkpoint_files
  • 45. In-Memory OLTP Recovery Read details from database Boot Page (DBINFO) Load ROOT file for system tables and file metadata Load ACTIVE DATA files filtered by DELTA streamed in parallel Redo COMMITED transactions greater than last CHECKPOINT LSN Checksum failures = RECOVERY_PENDING on startup We can create a thread per CPU for this We are a bit ERRORLOG “happy”
  • 47. Natively Compiled Procedures Native = C code vs T-SQL Built at CREATE PROC time or recompile Rebuilt when database comes online ‘XTP Native DLL’ in dm_os_loaded_modules
  • 48. The structure of a natively compiled proc create procedure cowboys_proc_scan with native_compilation, schemabinding as begin atomic with (transaction isolation level = snapshot, language = N'English') select player_number, player_name from dbo.starsoftheteam .. end Compile this Into a DLL Required. No referenced object/column can be dropped or altered. No SCH lock required Everything in block a single tran These are required. There are other options No “*” allowed Fully qualified names Bigger surface area in SQL Server 2016 Restrictions and Best Practices are here Iso levels still use MVCC Your queries
  • 49. The lifecycle of compilation Mixed Abstract Tree (MAT) built • Abstraction with flow and SQL specific info • Query trees into MAT nodes • XML file in the DLL dir Converted to Pure Imperative Tree (PIT) • General nodes of a tree that are SQL agnostic. C like data structures • Easier to turn into final C code Final C code built and written • C instead of C++ is simpler and faster compile. • No user defined strings or SQL identifiers to prevent injection Call cl.exe to compile, link, and generate DLL • Many of what is needed to execute in the DLL • Some calls into the HK Engine All files in BINNXTP VC has compiler files, DLLs and libs Gen has HK header and libs Call cl.exe to compile and link Done in memory xtp_matgen XEvent
  • 52. Inside the Memory of In-Memory OLTP Hekaton implements its own memory management system built on SQLOS • MEMORYCLERK_XTP (DB_ID_<dbid>) uses SQLOS Page allocator • Variable heaps created per table and range index • Hash indexes using partitioned memory objects for buckets • “System” memory for database independent tasks • Memory only limited by the OS (24TB in Windows Server 2016) • Details in dm_db_xtp_memory_consumers and dm_xtp_system_memory_consumers In-Memory does recognize SQL Server Memory Pressure • Garbage collection is triggered • If OOM, no inserts allowed but you may be able to DELETE to free up space Allocated at create index time Locked and Large apply
  • 53. Resource Governor and In-Memory OLTP Remember this is ALL memory SQL Server limits HK to ~70-90% of target depending on size Freeing up memory = drop db (immediate) or delete data/drop table (deferred) Binding to your own Resource Pool Best way to control and monitor memory usage of HK tables sp_xtp_bind_db_resource_pool required to bind db to RG pool What about CPU and I/O? RG I/O does not apply because we don’t use SQLOS for checkpoint I/O RG CPU does apply for user tasks because we run under SQLOS host no classifier function Not the same as memory
  • 55. Walk away with this  Want 30x faster for OLTP? Go In-Memory OLTP  Transactions are truly lock, latch, and spinlock free  Hash index for single row; Range for multi-row  Reduce transaction latency for super speed  SQL Server 2016 major step up from 2014
  • 56. We are not done • Move to GA for Azure SQL Database • Increase surface area of features support for SQL Server • Push the limits of hardware • Speeding up recovery and HA with NVDIMM and RDMA • Explore further areas of code optimization • SQL 2016 CU2 fix for session state workloads Read more here • Range index performance enhancements
  • 57. Resources • SQL Server In-Memory OLTP Internals for SQL Server 2016 • In-Memory OLTP Videos: What it is and When/How to use it • Explore In-Memory OLTP architectures and customer case studies • Review In-Memory OLTP in SQL Server 2016 and Azure SQL Database • In-Memory OLTP (In-Memory Optimization) docs
  • 59. Troubleshooting Validation errors. Check out this blog Upgrade from 2014 to 2016 can take time Large checkpoint files for 2016 could take up more space and increase recovery times Log growths, XTP_CHECKPOINT waits. Hotfix 6051103 Checkpoint file shut down and with no detailed info: https://support.microsoft.com/en-us/kb/3090141 Unable to rebuild log 6424109 (by design) Set filegroup offline. NEVER do it because you can’t set it online again (filestream limitation) You can’t remove HK filegroup after you add it 2016 LOB can cause memory issues Other CSS Observations • Hash index doesn’t have a concept of “covered index” • Low memory can cause recovery to fail (because we need everything in memory) • No dbcc checkdb support (but checkpoint files and backup have checksum) From the CSS team
  • 61. The Challenge 61 Client Access · Backup/ Restore · DBCC · Buik load · Exception handling · Event logging · XEvents SOS · Process model (scheduling & synchronization) · Memory management & caching · I/O Txn’s Lock mgr Buffer Pool Access Methods Query Execution & Expression Services UCS Query Optimization TSQL Parsing & Algebrizer Metadata File Manager Database Manager Procedure Cache · TDS handler · Session Mgmt Security Support Utilities UDTs & CLR Stored Procs SNI Service Broker Logging & Recovery 10%10% Network, TDS T-SQL interpreter Query Execution Expression s Access Methods Transactio n, Lock, Log Managers SOS, OS and I/O 35%45%
  • 62. SQLOS task and worker threads are the foundation “User” tasks to run transactions Hidden schedulers used for critical background tasks Some tasks dedicated while others use a “worker” pool The In-Memory OLTP Thread Model 62 SQLOS workers dedicated to HK
  • 63. NON-PREEMPTIVEPOOL Background workers needed for on-demand Parallel ALTER TABLE WORKERSPOOL Other workers needed for on- demand (non- preemptive) Parallel MERGE operation of checkpoint files PREEMPTIVEPOOL Background workers needed on-demand - preemptive Serializers to write out checkpoint files In-Memory OLTP Thread Pool 63 Command = XTP_THREAD_POOL wait_type = DISPATCHER_QUEUE_SEMAPHORE Command = UNKNOWN TOKEN wait_type = XTP_PREEMPTIVE_TASK Ideal count = < # schedulers> ; idle timeout = 10 secs HUU H U hidden scheduler user scheduler
  • 64. Data Modifications • Multi-version Optimistic Concurrency prevents all blocking • ALL UPDATEs are DELETE followed by INSERT • DELETED rows not automatically removed from memory • Deleted rows not visible to active transactions becomes stale • Garbage Collection process removes stale rows from memory • TRUNCATE TABLE not supported Page deallocation in SQL Server
  • 65. Garbage Collection Stale rows to be removed queued and removed by multiple workers User workers can clean up stale rows as part of normal transactions Dedicated threads (GC) per NODE to cleanup stale rows (awoken based on # deletes) SQL Server Memory pressure can accelerate Dusty corner scan cleanup – ones no transaction ever reaches This is the only way to reduce size of HK tables Long running transactions may prevent removal of deleted rows (visible != stale) Diagnostics rows_expired = rows that are stale rows_expired_removed = stale rows removed from memory Perfmon -SQL Server 2016 XTP Garbage Collection dm_db_xtp_gc_cycle_stats dm_xtp_gc_queue_stats
  • 66. Diagnostics for Native Procs Query Plans and Stats SHOWPLAN_XML is only option Operator list can be found here sp_xtp_control_query_exec_stats and sp_xtp_control_proc_exec_stats Query Store Plan store by default Runtime stats store needs above stored procedures to enable (need recompile) XEvent and SQLTrace sp_statement_completed works for Xevent but more limited information SQLTrace only picks up overall proc execution at batch level Read the docs
  • 67. Undo, tempdb, and BULK • Undo required 415 log records @46Kb • Hekaton required no logging • Tempdb has similar # log records as disk- based table but less per log record (aka minimal logging). Multi-row insert test Remember SCHEMA_ONLY has no logging or I/OBULK INSERT for in-mem executes and logged just like INSERT Minimally logged BULK INSERT took 271 log records @ 27Kb Latches required for GAM, PFS, and system table pates
  • 68. In-Memory OLTP pushes the log to the limit Multiple Log Writers in SQL Server 2016 One per NODE up to four We were able to increase log throughput from 600 to 900Mb/sec You could go to delayed durability Database is always consistent You could lose transactions Log at the speed of memory Tail of the log is a “memcpy” to commit Watch the video
  • 69. The Merge Process Over time we could have many CFPs Why not consolidate data and delta files into a smaller number of pairs? MERGE TARGET type is target of the merge of files. Becomes ACTIVE ACTIVE files that were source of merge become WAITING FOR LOG TRUNCATION
  • 70. Bob Ward Principal Architect, Microsoft Bob Ward is a Principal Architect for the Microsoft Data Group (Tiger Team). Bob has worked for Microsoft for 23 years supporting and speaking on every version of SQL Server shipped from OS/2 1.1 to SQL Server 2016. He has worked in customer support as a principal escalation engineer and Chief Technology Officer (CTO) interacting with some of the largest SQL Server deployments in the world. Bob is a well-known speaker on SQL Server often presenting talks on internals and troubleshooting at events such as SQL PASS Summit, SQLBits, SQLIntersection, and Microsoft Ignite. You can find him on twitter at @bobwardms or read his blog at https://blogs.msdn.microsoft.com/bobsql.. Place your photo here /bobwardms @bobwardms Bob Ward
  • 71. Explore Everything PASS Has to Offer FREE ONLINE WEBINAR EVENTS FREE 1-DAY LOCAL TRAINING EVENTS LOCAL USER GROUPS AROUND THE WORLD ONLINE SPECIAL INTEREST USER GROUPS BUSINESS ANALYTICS TRAINING VOLUNTEERING OPPORTUNITIES PASS COMMUNITY NEWSLETTER BA INSIGHTS NEWSLETTERFREE ONLINE RESOURCES
  • 72. Session Evaluations ways to access Go to passSummit.com Download the GuideBook App and search: PASS Summit 2016 Follow the QR code link displayed on session signage throughout the conference venue and in the program guide Submit by 5pm Friday November 6th to WIN prizes Your feedback is important and valuable. 3
  • 73. Thank You Learn more from Bob Ward bobward@company.com or follow @bobwardms

Editor's Notes

  1. The point of this slide is talk about capabilities that this deck will not cover
  2. Notice that log records are not even created until COMMIT.
  3. That is the greek word for Hekaton which means 100 because our original goal was to achieve 100x performance with In-Memory OLTP Follow the readme.txt file in the inmem_oltp_transaction_compare folder
  4. A request with command = XTP_CKPT_AGENT is used to keep FREE files replenished for CHECKPOINT files
  5. Follow the readme.txt file in the inmem_oltp_architecture folder
  6. Hash collision on name index with Ryan because it is a different value than Bob. The first two Bob rows are technically not a collision because they are the same key value
  7. Follow the readme.txt file in the inmem_oltp_data_and_indexes folder
  8. We will discuss the compilation process when talking about natively complied DLLs
  9. Follow the instructions in the readme.txt file in the inmem_oltp_schema_dll folder
  10. T-SQL IsXTPSupported() = Windows API IsProcessorFeaturePresent(PF_COMPARE_EXCHANGE128) = _InterlockedCompareExchange128() = Cmpxchg16b instruction
  11. Follow the instructions in the readme.txt file in the inmem_oltp_concurrency folder
  12. We can switch to “large” files (1Gb) if checkpoint has determine it can read 200Mb/sec and the machine has 16 cores and 128Gb for SQL Server
  13. Follow the instructions in readme.txt in the inmem_oltp_checkpoint folder
  14. Show the XML representing the MAT TREE Finding g_Bindings for a native proc DLL and set a breakpoint then step through it
  15. Probably cut slide
  16. Each pool is the size of number of schedulers after first item queued Timeout after 10 secs of idle NON-PREEMPTIVE and WORKERS POOL Command = XTP_THREAD_POOL Wait type = DISPATCHER_QUEUE_SEMAPHORE when idle PREEMPTIVE POOL Command = UNKNOWN TOKEN Wait type = XTP_PREEMPTIVE_TASK