HEKATON
SQL Server’s Memory-Optimized OLTP Engine
Presented by: Prutha Date and Siraj Memon
Outline
● Introduction
● Design Considerations
● High-Level Architecture
● Storage and Indexing
● Programmability and Query Processing
● Transaction Management and Logging
● Garbage Collection
● Experimental Results
● Conclusion
● Demo
Introduction
● Database Engine: Optimized for Memory-resident data
● Targeted for OLTP workloads
● Integrated into SQL Server and uses T-SQL
● Fully transactional and durable
● Tables and stored procedures: compiled into machine code
● Two Index Types: Hash Index and Range Index
● High level of concurrency
OLTP (Online Transaction Processing)
T-SQL (Transact-Structured Query Language)
Terminology
● Hekaton Table
● Hekaton Index
● Regular Table
● Regular Index
● Compiled Stored Procedure
● Interpreted Stored Procedure
Competitors
● Commercial
● VoltDB
● SAP in-memory computing
● Oracle TimesTen
● IBM SolidDB
● Research
● Hyrise
● H-store
● HyPer
Architectural Principles
● Optimize Indexes for main memory
● Uses lock-free hash tables and Bw-trees for optimized indexing
● Index operations not logged
● Rebuilding indexes during recovery
● Eliminate Latches and Locks
● Latch-free data structure – No latches or spinlocks
● Optimistic Multi-version concurrency control – transaction isolation
● Compile requests to native code
● Decisions: Compile time rather than Runtime
● Converts statements in T-SQL into customized, highly efficient machine code
Partitioning – We don’t like..
● Problem with Partitioning
● Secondary Indexes
● Works great ONLY if workload is also partitionable
● Not sufficiently robust for SQL Server
● Any thread can access any part of the database
● Single Shared hash table
High-Level Architecture
● Hekaton Storage Engine
● Manages user data and indexes
● Base mechanism for storage, checkpointing and high availability
● Hekaton Compiler
● Abstract tree representation of T-SQL stored procedure
● Compiles the procedure into native code
● Hekaton Runtime System
● Integration with SQL Server resources
● Common library of additional functionality
Hekaton and SQL Server
Storage and Indexing
● Two types of Index
● Hash Index: Lock-free hash tables
● Range Index: Bw-trees
● Use of Multiversioning – Updates create new version
● Reads:
● Read operation specifies a logical read time, and only versions whose valid time overlaps the read time are visible to the read
● At most one version is visible
● Updates:
● Delete Old - Insert New
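The read and update rules above can be sketched in a few lines of Python. This is only an illustrative model, not Hekaton's implementation: the `Version` class, the `read` and `update` helpers, and the choice to treat valid time as the half-open interval [begin, end) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

INFINITY = float("inf")  # end timestamp of a version that is still current


@dataclass
class Version:
    payload: dict
    begin: float            # commit time of the transaction that created it
    end: float = INFINITY   # commit time of the transaction that replaced or deleted it


def visible(version: Version, read_time: float) -> bool:
    # Assumption: valid time is the half-open interval [begin, end).
    return version.begin <= read_time < version.end


def read(chain: List[Version], read_time: float) -> Optional[dict]:
    # Valid times of one record's versions do not overlap,
    # so at most one version can match a given read time.
    for v in chain:
        if visible(v, read_time):
            return v.payload
    return None


def update(chain: List[Version], new_payload: dict, commit_time: float) -> None:
    # "Delete old, insert new": close the current version, then add a new one.
    for v in chain:
        if v.end == INFINITY:
            v.end = commit_time
    chain.append(Version(new_payload, begin=commit_time))


if __name__ == "__main__":
    chain = [Version({"name": "John", "balance": 100}, begin=10)]
    update(chain, {"name": "John", "balance": 130}, commit_time=20)
    print(read(chain, 15))  # a reader with an older read time still sees the old version
    print(read(chain, 25))  # a later reader sees the new version
```

Because the old version is never overwritten in place, a reader whose logical read time predates the update is not disturbed by it.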
Storage and Indexing (continued)
Architecture of Hekaton Compiler
Programmability and Query Processing
● Compile-once Execute-many-times
● High level of language compatibility
● Reuse of SQL Server T-SQL compilation stack
● Output of Hekaton compiler is C code
● Invoking the compiler:
● During creation of a memory optimized table
● During creation of a compiled stored procedure
Schema Compilation
● Hekaton storage engine treats records as opaque objects
● Hekaton compiler provides the engine with customized callback functions for each table
● Task of Callback functions
● Computing a hash function on a key or record
● Comparing two records
● Serializing a record into a log buffer
● Callback functions are compiled into native code, which makes index operations extremely efficient
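A rough Python analogy of this callback arrangement follows. Hekaton generates C and compiles it to native code; here plain Python closures stand in for the generated functions. The two-column table layout, `TableCallbacks`, `make_callbacks` and `HashIndex` are hypothetical names invented for this sketch.

```python
import struct
import zlib
from dataclasses import dataclass
from typing import Callable, List, Tuple

Record = Tuple[int, str]  # hypothetical table schema: (id, name)


@dataclass(frozen=True)
class TableCallbacks:
    hash_key: Callable[[Record], int]
    compare: Callable[[Record, Record], int]
    serialize: Callable[[Record], bytes]


def make_callbacks() -> TableCallbacks:
    """Stand-in for the schema-specific functions a compiler would emit."""

    def hash_key(rec: Record) -> int:
        # Hash only the key column (id).
        return zlib.crc32(struct.pack("<q", rec[0]))

    def compare(a: Record, b: Record) -> int:
        # Three-way comparison on the key column.
        return (a[0] > b[0]) - (a[0] < b[0])

    def serialize(rec: Record) -> bytes:
        # Fixed-width id, then a length-prefixed UTF-8 name.
        name = rec[1].encode("utf-8")
        return struct.pack("<qH", rec[0], len(name)) + name

    return TableCallbacks(hash_key, compare, serialize)


class HashIndex:
    """Engine side: treats records as opaque and only calls the callbacks."""

    def __init__(self, callbacks: TableCallbacks, buckets: int = 8) -> None:
        self.cb = callbacks
        self.buckets: List[list] = [[] for _ in range(buckets)]

    def insert(self, rec) -> None:
        self.buckets[self.cb.hash_key(rec) % len(self.buckets)].append(rec)

    def lookup(self, probe) -> list:
        bucket = self.buckets[self.cb.hash_key(probe) % len(self.buckets)]
        return [r for r in bucket if self.cb.compare(r, probe) == 0]
```

The index code never touches a record field itself, which is what lets one engine serve every table while the per-table logic stays specialized.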
Compiled Stored Procedure
● Compatibility issues between T-SQL and C datatypes
● Problem Solver:
● MAT (Mixed Abstract Tree)
● PIT (Pure Imperative Tree)
● Each operator implements a common interface so that operators can be composed into arbitrarily complex plans
● Entire query plan is collapsed into a single function using labels and gotos
● Supports both blocking and non-blocking operators
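To make the contrast concrete, here is a loose Python sketch of the same small plan written twice: once as composable operators sharing one interface, and once collapsed into a single function. Python has no goto, so an ordinary loop stands in for the labels and jumps of the generated C. The table contents, column names and function names are assumptions for illustration only.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def scan(table: Iterable[Row]) -> Iterator[Row]:
    # Leaf operator: produces every row of the table.
    for row in table:
        yield row


def filter_op(child: Iterator[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    # Passes through only the rows that satisfy the predicate.
    for row in child:
        if predicate(row):
            yield row


def project(child: Iterator[Row], columns: List[str]) -> Iterator[Row]:
    # Keeps only the requested columns.
    for row in child:
        yield {c: row[c] for c in columns}


def composed_plan(table: List[Row], min_balance: int) -> List[Row]:
    # Operators composed through a common iterator interface.
    rows = filter_op(scan(table), lambda r: r["balance"] > min_balance)
    return list(project(rows, ["name"]))


def fused_plan(table: List[Row], min_balance: int) -> List[Row]:
    # The same plan collapsed into one function body: no per-operator
    # calls, just straight-line control flow over the rows.
    out: List[Row] = []
    for row in table:
        if row["balance"] > min_balance:
            out.append({"name": row["name"]})
    return out
```

Both functions return the same rows; the fused form simply removes the call overhead between operators, which is the motivation the slide gives for generating a single function.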
Example
Fig.1: Sample T-SQL Procedure
Fig.2: Query Plan
Fig.3: Operator interconnections for Sample Procedure
Query Interop
● Restrictions of Compiled Stored Procedures
● Supports limited set of options
● Stored procedures must execute in a predefined security context
● Must execute in the context of a single transaction
● Ad-hoc mechanism that enables the conventional query execution engine to access memory optimized tables
● Features
● Import and Export for memory optimized tables
● Ad-hoc queries and data repair support
● Support for transactions that access both kinds of tables
● Ease of app migration
Transaction Management
● Hekaton utilizes optimistic multiversion concurrency control (MVCC) to provide snapshot, repeatable read and serializable transaction isolation without locking
● Serializable – guarantee that a transaction will see exactly the same data if all its reads were repeated at the end of the transaction
● Properties to ensure serializability:
● Read stability
● Phantom avoidance
● Timestamps are used to specify
● Valid Time
● Logical Read Time
● Commit/End time
● Version visible if Begin Time < Read Time < End Time
Transaction Commit Processing
● Validation and Dependencies
● Obtain End timestamp
● Validate for Read Stability and Phantom Avoidance
● Commit Dependency
● Dependency counter
● Read barrier
● Commit Logging and Post-Processing
● Changes to database are logged to transaction log
● Update versions with the end timestamp of the transaction
● Transaction Rollback
● Invalidate all versions created by the transaction using Write Sets.
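The validation step can be illustrated with a small Python model. It is a simplified sketch under stated assumptions: a transaction tracks the versions it read and the predicates it scanned with, visibility uses the half-open interval [begin, end), and the transaction's own writes, commit dependencies and logging are left out. All class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List

INFINITY = float("inf")


@dataclass
class Version:
    key: int
    value: int
    begin: float
    end: float = INFINITY


def visible(v: Version, t: float) -> bool:
    # Assumption: valid time is the half-open interval [begin, end).
    return v.begin <= t < v.end


@dataclass
class Transaction:
    read_time: float
    read_set: List[Version] = field(default_factory=list)
    scan_set: List[Callable[[Version], bool]] = field(default_factory=list)

    def scan(self, table: List[Version], predicate: Callable[[Version], bool]) -> List[Version]:
        # Remember both the predicate and the versions it returned.
        self.scan_set.append(predicate)
        rows = [v for v in table if predicate(v) and visible(v, self.read_time)]
        self.read_set.extend(rows)
        return rows


def validate(txn: Transaction, table: List[Version], end_time: float) -> bool:
    # Read stability: every version read is still visible at the end timestamp.
    if not all(visible(v, end_time) for v in txn.read_set):
        return False
    # Phantom avoidance: repeating each scan at the end timestamp must not
    # find a matching version that was invisible at the read time.
    for predicate in txn.scan_set:
        for v in table:
            if predicate(v) and visible(v, end_time) and not visible(v, txn.read_time):
                return False
    return True
```

If `validate` returns False the transaction would abort rather than commit, which is the optimistic part of the scheme: conflicts are detected at the end instead of being prevented by locks.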
Transaction Durability
● Uses transaction logs and checkpoints to ensure durability
● Integrated with the Always-On component that maintains highly available replicas
● Data on external storage consists of –
● Log streams (logical effects of committed transactions, sufficient to redo them)
● Checkpoint streams (a compressed representation of the log), in two forms:
● Data streams (all versions inserted during a timestamp interval)
● Delta streams (a dense list of integers identifying deleted versions for the corresponding data stream)
● Note: Index operations are not logged; indexes are reconstructed on recovery.
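As a hedged illustration of how data and delta streams could be combined on recovery, the Python sketch below loads each data stream, skips the entries its delta stream marks as deleted, and then rebuilds an index from the surviving rows. The in-memory layout is an assumption: rows are `(key, payload)` tuples and each delta stream is a list of positions within its data stream. Replaying the tail of the log is omitted.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, int]  # hypothetical row layout: (key, payload)


def surviving_versions(data_stream: List[Row], delta_stream: List[int]) -> List[Row]:
    # Assumption: delta entries are positions of deleted versions
    # within the corresponding data stream.
    deleted = set(delta_stream)
    return [row for pos, row in enumerate(data_stream) if pos not in deleted]


def recover(checkpoint_pairs: List[Tuple[List[Row], List[int]]]):
    # Load every (data stream, delta stream) pair in timestamp order.
    rows: List[Row] = []
    for data_stream, delta_stream in checkpoint_pairs:
        rows.extend(surviving_versions(data_stream, delta_stream))
    # Indexes are not stored in the checkpoint; rebuild them from the rows.
    index: Dict[str, Row] = {}
    for row in rows:
        index[row[0]] = row
    return rows, index
```

Since each pair can be processed independently before the index is built, this layout lends itself to loading streams in parallel.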
Transaction Logging and Checkpoints
● Transaction Logging
● One transaction – one log record
● Does not use WAL (Write-ahead logging)
● Uses a single log stream per database
● Checkpoints
● Continuous Checkpointing
● Streaming I/O
● Checkpoint Files and Checkpoint Process
● Recovery
● Parallelism within Hekaton
● Parallelism between SQL Server and Hekaton
Garbage Collection
● A version of a record is garbage if it is no longer visible to any active transaction
● Properties of GC subsystem: non-blocking, cooperative, incremental, parallelizable and scalable
● Garbage Correctness
● A version whose end timestamp is earlier than the begin timestamp of the oldest active transaction is not visible to any transaction
● Version becomes garbage if:
● It was deleted (explicit DELETE or through UPDATE) and cannot be read or acted upon by any active transaction, or
● It was created by a transaction that rolled back
● Garbage Removal
● Unlink from indexes
● Reclaim the version
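The correctness rule and the two removal steps can be sketched as follows. This is a single-threaded toy model, whereas the slide describes a non-blocking, cooperative collector; versions created by rolled-back transactions are not modeled. The `Version` layout, the dictionary used as an index, and the `collect` signature are assumptions for this example.

```python
from dataclasses import dataclass
from typing import Dict, List

INFINITY = float("inf")


@dataclass
class Version:
    key: str
    payload: str
    begin: float
    end: float = INFINITY


def is_garbage(v: Version, oldest_active_begin: float) -> bool:
    # A version that ended before the oldest active transaction began
    # cannot be visible to any active transaction.
    return v.end < oldest_active_begin


def collect(index: Dict[str, List[Version]], active_txn_begins: List[float], now: float) -> int:
    # With no active transactions, everything that ended before "now" is garbage.
    oldest = min(active_txn_begins, default=now)
    reclaimed = 0
    for key in list(index):
        keep = [v for v in index[key] if not is_garbage(v, oldest)]
        reclaimed += len(index[key]) - len(keep)  # versions unlinked from the index
        if keep:
            index[key] = keep
        else:
            del index[key]  # no versions left for this key
    return reclaimed
```

Dropping the last reference to an unlinked version lets Python reclaim it, which plays the role of the explicit "reclaim the version" step.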
Experimental Results - CPU Efficiency
Comparison of CPU efficiency for lookups
Comparison of CPU efficiency for updates
Experimental Results - Scaling Under Contention
● Experiment illustrating scalability of Hekaton engine
Conclusion
● A database engine by Microsoft optimized for in-memory OLTP workloads
● Fully integrated with SQL Server
● Uses latch-free data structures, multiversion concurrency control, and compiled T-SQL stored procedures
● Ensures durability by logging and checkpointing
● High availability – SQL Server’s Always-On feature
● Order-of-magnitude improvement in efficiency and scalability with minimal changes to user applications.
References
● http://vldb.org/pvldb/vol5/p298_per-akelarson_vldb2012.pdf
● http://nms.csail.mit.edu/~stavros/pubs/OLTP_sigmod08.pdf
● http://www.cs.cmu.edu/~pavlo/courses/fall2013/static/papers/edbt09shoremt.pdf
● http://research.microsoft.com/pubs/178758/bw-tree-icde2013-final.pdf
● https://voltdb.com/
● http://llvm.org/
● http://www.oracle.com/technetwork/database/database-technologies/timesten/overview/index.html
Demo
THANK YOU
Questions??
