Ramzi Alqrainy Sep 9, 2015
MemSQL
The Fastest In-Memory Database
MemSQL the company
• Experienced leadership from Facebook, SQL Server, Oracle, Fusion-io
• In-Memory, distributed, relational database
• Solving the Enterprise Architecture Gap
• Horizontal scale-out with modern database innovation
• $50 million in funding
Going Real-Time is the Next Phase for Big Data
• Search of consumer data storage
• Key challenges: not all users are equal. Users grow and change all the time
• Petabytes of data, millions of users, 1000s of nodes
• Learn more: https://www.youtube.com/watch?v=_Erkln5WWLw and http://www.slideshare.net/lucidworks/scaling-solrcloud-to-a-large-number-of-collections-shalin-shekhar-mangar
More Sensors
More Interconnectivity
More User Demand
…and companies are at risk of being left behind
Current Data Management Challenges
ETL
Batch Processing
Big Iron Appliances
MemSQL the software
• Distributed and Parallel
• Shared-Nothing, Lock-Free
• Data in memory or SSD
• SQL all the way down
MemSQL Engine: “memsqld”
• Basic scaling unit of a cluster
• A full, independent RDBMS
• 50,000 inserts / sec on wide table
• ~1M inserts / sec on skinny table
• Millions of primary-key lookups / sec
MemSQL Engine: Aggregators Aggregate
Agg 1 Agg 2
Leaf 1 Leaf 2 Leaf 3 Leaf 4
MemSQL Engine: Leaves Hold Data
Agg 1 Agg 2
Leaf 1 Leaf 2 Leaf 3 Leaf 4
MemSQL Engine: Sharding and Joins
Agg 1 Agg 2
Leaf 1 Leaf 2 Leaf 3 Leaf 4
select * from lineitem L, orders O
where L.orderkey = O.orderkey...

leaf1> using memsql_demo_0
select * from lineitem L, orders O
where L.orderkey = O.orderkey...

leaf2> using memsql_demo_1
select * from lineitem L, orders O
where L.orderkey = O.orderkey...
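
The same join can run unchanged on each leaf when matching rows live in the same partition. A minimal Python sketch of that idea, assuming both tables are hash-sharded on orderkey. The partition count, the hash function, and every helper name here are hypothetical illustrations, not MemSQL internals:

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 2  # hypothetical; stands in for memsql_demo_0 and memsql_demo_1


def partition_for(orderkey: int) -> int:
    """Map a shard key to a partition with a stable hash (illustrative choice)."""
    return zlib.crc32(str(orderkey).encode()) % NUM_PARTITIONS


def shard(rows, key):
    """Split rows into per-partition lists by hashing the shard key."""
    parts = defaultdict(list)
    for row in rows:
        parts[partition_for(row[key])].append(row)
    return parts


def local_join(lineitems, orders):
    """Leaf role: join one partition's lineitem rows to its orders rows."""
    by_key = {o["orderkey"]: o for o in orders}
    return [(l, by_key[l["orderkey"]]) for l in lineitems if l["orderkey"] in by_key]


def distributed_join(lineitems, orders):
    """Aggregator role: run the same join on every partition, then merge."""
    l_parts = shard(lineitems, "orderkey")
    o_parts = shard(orders, "orderkey")
    result = []
    for p in range(NUM_PARTITIONS):
        result.extend(local_join(l_parts[p], o_parts[p]))
    return result
```

Because both tables hash the same key the same way, no row has to move between partitions to find its join partner in this sketch.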
  
MemSQL Engine: Compiled Queries
Parse → In Cache? → Codegen (if not cached) → Execute
select * from foo where id=1234
and name like '%jingleheimer%';

SELECT * FROM foo WHERE id = @
AND name LIKE ^
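
The second query above is the first with its literals replaced by placeholders, so queries that differ only in constants share one cache entry. A rough Python sketch of that lookup, assuming @ stands for a numeric literal and ^ for a string literal as on the slide. The regexes, the PlanCache class, and the placeholder "plan" string are hypothetical, and unlike the slide this sketch does not normalize keyword case:

```python
import re

_STRING = re.compile(r"'[^']*'")   # single-quoted string literals
_NUMBER = re.compile(r"\b\d+\b")   # standalone integer literals
_SPACE = re.compile(r"\s+")


def parameterize(sql: str) -> str:
    """Replace literals with placeholders so similar queries look identical."""
    s = _STRING.sub("^", sql)      # strings first, so digits inside them are gone
    s = _NUMBER.sub("@", s)
    return _SPACE.sub(" ", s).strip()


class PlanCache:
    """Cache keyed by query shape; 'codegen' runs only on a miss."""

    def __init__(self):
        self.plans = {}
        self.compilations = 0

    def get_plan(self, sql: str) -> str:
        shape = parameterize(sql)
        if shape not in self.plans:
            self.compilations += 1
            # Placeholder for generated code; a real engine would compile here.
            self.plans[shape] = f"plan for: {shape}"
        return self.plans[shape]
```

Two lookups such as id=1234 and id=99 against the same table then cost one compilation in this sketch, with the second call served from the cache.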
Durability: Transactions (MVCC)
[Diagram: row versions v0 through v4, with readers, a writer, and a waiting writer]
• Every write creates a new version of the row
• Old versions get garbage-collected
• Reads are never blocked
• Row-level locking for writes
• Allows online ALTER TABLE!
• Multi-statement transactions
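
The first three bullets can be illustrated with a small Python sketch of a versioned row. This is a simplified model, assuming integer commit versions and a known oldest active reader snapshot. The VersionedRow class and its method names are hypothetical, and row-level write locking is left out:

```python
class VersionedRow:
    """One row stored as (commit_version, value) pairs, oldest first."""

    def __init__(self, value):
        self.versions = [(0, value)]

    def write(self, commit_version, value):
        """A write never overwrites in place; it appends a new version."""
        self.versions.append((commit_version, value))

    def read(self, snapshot_version):
        """Return the newest value committed at or before the reader's snapshot."""
        for version, value in reversed(self.versions):
            if version <= snapshot_version:
                return value
        raise KeyError("row not visible at this snapshot")

    def garbage_collect(self, oldest_active_snapshot):
        """Drop versions that no active reader can still see."""
        keep_from = 0
        for i, (version, _) in enumerate(self.versions):
            if version <= oldest_active_snapshot:
                keep_from = i  # newest version the oldest reader still needs
        self.versions = self.versions[keep_from:]
```

A reader holding snapshot 1 keeps seeing the value committed at version 1 even after version 2 is written, which is why reads in this model never wait on a writer.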
Durability: Logging and Snapshots
Every write is saved to the transaction log on disk:
/var/lib/memsql/data/logs
Periodic compaction into a snapshot file:
/var/lib/memsql/data/snapshots
On restart, data is loaded into RAM
The latest two snapshots are kept by default
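
A minimal Python sketch of the log-plus-snapshot pattern described above, assuming a simple key-value store. The file names, the JSON format, and the DurableStore class are hypothetical and say nothing about MemSQL's actual on-disk layout; this sketch also keeps only one snapshot rather than two:

```python
import json
import os
import tempfile


class DurableStore:
    """In-memory dict made durable by an append-only log and a snapshot file."""

    def __init__(self, data_dir):
        self.log_path = os.path.join(data_dir, "log.jsonl")
        self.snapshot_path = os.path.join(data_dir, "snapshot.json")
        self.data = {}
        self._recover()

    def _recover(self):
        """On restart: load the snapshot, then replay the log on top of it."""
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value

    def put(self, key, value):
        """Append the write to the log and sync it before applying in memory."""
        with open(self.log_path, "a") as f:
            f.write(json.dumps([key, value]) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.data[key] = value

    def snapshot(self):
        """Compact: write the full state, swap it in atomically, reset the log."""
        tmp = self.snapshot_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.data, f)
        os.replace(tmp, self.snapshot_path)
        # Replaying an old log over a newer snapshot is harmless for puts,
        # so a crash between the swap and this truncation loses nothing.
        open(self.log_path, "w").close()


def demo():
    """Write, snapshot, write again, then 'restart' and return recovered data."""
    data_dir = tempfile.mkdtemp()
    store = DurableStore(data_dir)
    store.put("a", 1)
    store.snapshot()
    store.put("b", 2)
    return DurableStore(data_dir).data
```

In demo(), the restarted store gets "a" from the snapshot and "b" from the log, which mirrors the restart step on the slide.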
Durability: High Availability
Agg 1 Agg 2
Leaf 1 Leaf 2 Leaf 3 Leaf 4
Leaves are paired up
Partitions replicated async
Automatically fails over
Uses 2X space
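
A toy Python model of paired leaves with asynchronous replication and failover. It is a sketch under stated assumptions: the Leaf and PairedPartition classes, the in-process queue, and the failure check on read are all hypothetical and much simpler than a real cluster:

```python
from collections import deque


class Leaf:
    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.alive = True


class PairedPartition:
    """A partition with a master copy on one leaf and a replica on its pair."""

    def __init__(self, master, replica):
        self.master, self.replica = master, replica
        self.pending = deque()  # writes applied on the master, not yet replicated

    def write(self, key, value):
        self.master.rows[key] = value      # acknowledged immediately
        self.pending.append((key, value))  # shipped to the replica later (async)

    def replicate(self):
        """Drain the queue; a real system would do this in the background."""
        while self.pending:
            key, value = self.pending.popleft()
            self.replica.rows[key] = value

    def failover(self):
        """Promote the replica; in this sketch, unreplicated writes are lost."""
        self.master, self.replica = self.replica, self.master
        self.pending.clear()

    def read(self, key):
        if not self.master.alive:
            self.failover()
        return self.master.rows.get(key)
```

Each row is held twice, once per leaf in the pair, which corresponds to the 2X space cost listed above.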
Licensing
Community Edition
• Free Forever, Unlimited Scale
• Full SQL features
Enterprise Edition
• Subscription basis, by RAM capacity
• No limit on disk storage
• Enterprise support
• Replication / High Availability
DEMO
