Microsoft Hekaton

HEKATON
SQL Server’s Memory-Optimized OLTP Engine
Presented by: Prutha Date and Siraj Memon

Outline
● Introduction
● Design Consideration
● High-Level Architecture
● Storage and Indexing
● Programmability and Query Processing
● Transaction Management and Logging
● Garbage Collection
● Experimental Results
● Conclusion
● Demo

Introduction
● Database Engine: Optimized for Memory-resident data
● Targeted for OLTP workloads
● Integrated into SQL Server and uses T-SQL
● Fully transactional and durable
● Tables - Compiled into machine code
● Two Index Types: Hash Index and Range Index
● High-level of concurrency
OLTP (Online Transaction Processing)
T-SQL (Transact - Structured Query Language)

Terminology
● Hekaton Table
● Hekaton Index
● Regular Table
● Regular Index
● Compiled Stored Procedure
● Interpreted Stored Procedure

Competitors
● Commercial
● VoltDB
● SAP in-memory computing
● Oracle TimesTen
● IBM SolidDB
● Research
● Hyrise
● H-store
● HyPer

Architectural Principles
● Optimize Indexes for main memory
● Uses lock-free hash tables and Bw-trees for optimized indexing
● Index operations not logged
● Rebuilding indexes during recovery
● Eliminate Latches and Lock
● Latch-free data structure – No latches or spinlocks
● Optimistic Multi-version concurrency control – transaction isolation
● Compile requests to native code
● Decisions: Compile time rather than Runtime
● Converts statements in T-SQL into customized, highly-efficient machine
code

Partitioning – We don’t like..
● Problem with Partitioning
● Secondary Indexes
● Works great ONLY if workload is also partitionable
● Not sufficiently robust for SQL server
● Any thread can access any part of the database
● Single Shared hash table

High Level Architecture
● Hekaton Storage Engine
● Manages user data and indexes
● Base mechanism for storage, check-pointing and high-availability
● Hekaton Compiler
● Abstract tree representation of T-SQL stored procedure
● Compiles the procedure into native code
● Hekaton Runtime System
● Integration with SQL Server resources
● Common library of additional functionality

Storage and Indexing
● Two types of Index
● Hash Index: Lock-free hash tables
● Range Index: Bw-trees
● Use of Multiversioning – Updates create new version
● Reads:
● Read operation specifies a logical read time and only versions whose valid
time overlaps the read time are visible to the read
● At most one version is visible
● Updates:
● Delete Old - Insert New

Storage and Indexing (continued)

Architecture of Hekaton Compiler

Programmability and Query Processing
● Compile-once Execute-many-times
● High level of language compatibility
● Reuse of SQL Server T-SQL compilation stack
● Output of Hekaton compiler is C code
● Invoking the compiler:
● During creation of a memory optimized table
● During creation of a compiled stored procedure

Schema Compilation
● Hekaton storage engine treats records as opaque objects
● Hekaton compiler provides the engine with customized callback
functions for each table
● Task of Callback functions
● Computing a hash function on a key or record
● Comparing two records
● Serializing a record into a log buffer
● Callback functions are compiled into Native code which makes index
operations extremely efficient

Compiled Stored Procedure
● Compatibility issues between T-SQL and C datatypes
● Problem Solver:
● MAT (Mixed Abstract Tree)
● PIT (Pure Imperative Tree)
● Each operator implements a common interface so that they can be
composed into arbitrarily complex plans
● Entire Query plan into a single function using labels and gotos
● Supports both blocking and non-blocking operators

Example
Fig.1: Sample T-SQL Procedure Fig.2: Query Plan
Fig.3: Operator interconnections for Sample Procedure

Query Interop
● Restrictions of Compiled Stored Procedures
● Supports limited set of options
● Stored procedures must execute in a predefined security context
● Must execute in the context of a single transaction
● Ad-hoc mechanism that enables conventional query execution engine
to access memory optimized tables
● Features
● Import and Export for memory optimized tables
● Ad-hoc queries and data repair support
● Support for transactions that access both kind of tables
● Ease of app migration

Transaction Management
● Hekaton utilizes optimistic multiversion concurrency control (MVCC)
to provide snapshot, repeatable read and serializable transaction
isolation without locking
● Serializable – guarantee that transaction will see exactly the same
data if all its reads were repeated at the end of the transaction
● Properties to ensure serializability:
● Read stability
● Phantom avoidance
● Timestamps are used to specify
● Valid Time
● Logical Read Time
● Commit/End time
● Version visible if Begin Time < Read Time < Execution Time

Transaction Commit Processing
● Validation and Dependencies
● Obtain End timestamp
● Validate for Read Stability and Phantom Avoidance
● Commit Dependency
● Dependency counter
● Read barrier
● Commit Logging and Post-Processing
● Changes to database are logged to transaction log
● Update versions with end timestamp of transactions
● Transaction Rollback
● Invalidate all versions created by the transaction using Write Sets.

Transaction Durability
● Uses transaction logs and checkpoints to ensure durability
● Integrated with Always-On component that maintains highly available
replicas
● Data on external storage consists of –
● Log streams (Logical effects of committed transactions to redo it)
● Checkpoint streams (Compressed representation of the log)
● Data Stream (all inserted versions during a timestamp interval)
● Delta Stream (a dense list of integers identifying deleted versions for its
corresponding data stream)
● Note: Index operations are not logged; They are reconstructed on
recovery.

Transaction Logging and Checkpoints
● Transaction Logging
● One transaction – one log file
● Does not use WAL (Write-ahead logging)
● Uses a single log stream per database
● Checkpoints
● Continuous Checkpointing
● Streaming I/O
● Checkpoint Files and Checkpoint Process
● Recovery
● Parallelism within Hekaton
● Parallelism between SQL Server and Hekaton

Garbage Collection
● Version of a record is garbage if it is no longer visible to any active
transaction
● Properties of GC subsystem: Non-blocking, co-operative, incremental,
parallelizable and scalable
● Garbage Correctness
● Version whose end timestamp < Oldest active transaction is not
visible
● Version becomes garbage if -
●Deleted (Explicit DELETE or through UPDATE)
●Cannot be read or acted upon by any active transaction
●Transaction Rollback
● Garbage Removal
● Unlink from indexes
● Reclaim the version

Experimental Results - CPU Efficiency
Comparison of CPU efficiency for lookups Comparison of CPU efficiency for updates

Experimental Results - Scaling Under
Contention
• Experiment illustrating scalability of Hekaton engine

Conclusion
● Optimized in-memory OLTP workloads oriented database engine by
Microsoft
● Fully integrated with SQL Server
● Uses latch-free data structures, multi-versioning concurrency control,
compiled T-SQL stored procedure
● Ensure durability by logging and checkpointing
● High availability – SQL Server’s Always-On feature
● Order of magnitude improvement in efficiency and scalability with
minimal changes to user applications.

References
● http://vldb.org/pvldb/vol5/p298_per-akelarson_vldb2012.pdf
● http://nms.csail.mit.edu/~stavros/pubs/OLTP_sigmod08.pdf
● http://www.cs.cmu.edu/~pavlo/courses/fall2013/static/papers/edbt09shor
emt.pdf
● http://research.microsoft.com/pubs/178758/bw-tree-icde2013-final.pdf
● https://voltdb.com/
● http://llvm.org/
● http://www.oracle.com/technetwork/database/database-
technologies/timesten/overview/index.html

Microsoft Hekaton

Recommended

Recommended

More Related Content

What's hot

What's hot (11)

Viewers also liked

Viewers also liked (11)

Similar to Microsoft Hekaton

Similar to Microsoft Hekaton (20)

Recently uploaded

Recently uploaded (20)

Microsoft Hekaton