SortaSQL

Ian Pye <ian@cloudflare.com>

Everyone likes SQL
• Tables
• Joins
• Online Transaction Processing
• Transactions!
• Arbitrary Queries

Scaling?
• What happens to joins when your data doesn’t fit in memory?
• I only need get and set for my data
• Sharding is too hard/unreliable
• A “monopolistically competitive market”?

Scaling

Seamless Horizontal Scalability from 1 to N

Proposal:
Let the Filesystem do the hard work
• RDBMS presents a full SQL interface to
applications, automatically accessing files to get
data as needed
• RDBMS stores metadata allowing it to find the
right data files
• Embedded key/value store handles the record
level storage, locking, caching, etc.
• FS (local or distributed) stores data and is
responsible for replication, performance, locking,
etc.
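The layering proposed above can be sketched in a few lines. This is only an illustration under stated assumptions, not the SortaSQL implementation: sqlite3 stands in for the RDBMS that holds metadata, dbm.dumb stands in for the embedded key/value store, a temporary directory stands in for the filesystem, and the table and function names (data_files, put, get) are hypothetical.

```python
import dbm.dumb
import os
import sqlite3
import tempfile

# Stand-ins (assumptions): sqlite3 plays the RDBMS, dbm.dumb plays the
# embedded key/value store, and a temp directory plays the filesystem.
root = tempfile.mkdtemp()
meta = sqlite3.connect(":memory:")
meta.execute("CREATE TABLE data_files (owner TEXT, period TEXT, path TEXT)")


def _lookup(owner, period):
    """Ask the metadata table which data file holds (owner, period)."""
    row = meta.execute(
        "SELECT path FROM data_files WHERE owner = ? AND period = ?",
        (owner, period),
    ).fetchone()
    return row[0] if row else None


def put(owner, period, key, value):
    """Write a record, registering a new data file in the metadata if needed."""
    path = _lookup(owner, period)
    if path is None:
        path = os.path.join(root, f"{owner}_{period}")
        meta.execute(
            "INSERT INTO data_files VALUES (?, ?, ?)", (owner, period, path)
        )
    with dbm.dumb.open(path, "c") as store:
        store[key] = value


def get(owner, period, key):
    """Read a record, or return None if no file or no key exists."""
    path = _lookup(owner, period)
    if path is None:
        return None
    with dbm.dumb.open(path, "r") as store:
        return store.get(key)
```

The metadata lookup is the only step that touches the relational side; everything below it is plain file access, which is why replication and placement can be left to the filesystem.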

Major Wins
• Scales continuously from 1 to 100 servers (FS permitting)
• Hot/cold storage hierarchy
• Allows ad-hoc queries via mature SQL
• Everyone already has built-in bindings

Architecture

• Application talks SQL to PostgreSQL
• PostgreSQL stores metadata
• PostgreSQL performs post-processing on rows retrieved from Kyoto Cabinet (KC) files
• KC files live on a POSIX filesystem

Architecture
Application
  ↓ SQL
PostgreSQL
  ↓
SortaSQL Plugin
  ↓
libKC | libKC | libKC | libKC
  ↓
Kyoto Cabinet | Kyoto Cabinet | Kyoto Cabinet | Kyoto Cabinet
  ↓
Filesystem | Filesystem

Bigtable
“A Bigtable is a sparse, distributed, persistent multi-dimensional sorted map.”

Multi-Dimensional
• Storing values as protocol buffers allows for arbitrarily complex maps
• When a map gets too big, it is promoted to a top-level KC store
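A rough sketch of the promotion idea follows. It is not the SortaSQL code: JSON stands in for protocol buffers, plain dicts stand in for KC stores, and the NestedStore class and the PROMOTE_BYTES threshold are hypothetical names chosen for the example.

```python
import json

PROMOTE_BYTES = 64  # hypothetical threshold, kept tiny so the demo triggers it


class NestedStore:
    """Nested maps stored inline until they grow past a size threshold."""

    def __init__(self):
        self.top = {}       # key -> serialized map (stands in for one KC file)
        self.promoted = {}  # key -> dict acting as its own top-level store

    def set(self, key, field, value):
        if key in self.promoted:
            # Already promoted: write straight into its dedicated store.
            self.promoted[key][field] = value
            return
        inner = json.loads(self.top.get(key, "{}"))
        inner[field] = value
        blob = json.dumps(inner)
        if len(blob) > PROMOTE_BYTES:
            # Too big to keep inline: move the map to its own store.
            self.promoted[key] = inner
            self.top.pop(key, None)
        else:
            self.top[key] = blob

    def get(self, key, field):
        if key in self.promoted:
            return self.promoted[key].get(field)
        return json.loads(self.top.get(key, "{}")).get(field)
```

Callers use the same set and get calls before and after promotion, so the move is invisible to them.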

Persistent and Sorted
• Any key/value store that holds binary values and indexes its keys with a B+Tree will do
• We use Kyoto Cabinet (successor to Tokyo Cabinet)
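Why sorted keys matter can be shown with a small stand-in. The sketch below is not a B+Tree and does not use Kyoto Cabinet; it keeps keys in a sorted Python list (the hypothetical SortedStore class) just to show the range scan that an ordered key index makes cheap.

```python
import bisect


class SortedStore:
    """Toy ordered key/value store: binary keys and values, range scans."""

    def __init__(self):
        self._keys = []    # always kept in sorted order
        self._values = {}  # key -> value

    def set(self, key: bytes, value: bytes) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start: bytes, end: bytes):
        """Yield (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]
```

With date-prefixed keys, a query for one period becomes a single contiguous scan rather than a lookup per record.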

Sparse
• Values can be arbitrarily different.
• NULLs are free (or cheap)
• Protocol Buffers again to the rescue.
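The "NULLs are free" point can be illustrated with any encoding that simply omits unset fields. The sketch below uses JSON as a stand-in for protocol buffers; the encode and decode helpers and the column names are hypothetical.

```python
import json


def encode(record):
    """Serialize only the fields that are set; None (NULL) fields take no space."""
    present = {k: v for k, v in record.items() if v is not None}
    return json.dumps(present, sort_keys=True).encode()


def decode(blob, columns):
    """Rebuild a full row, filling every absent column with None."""
    present = json.loads(blob)
    return {c: present.get(c) for c in columns}
```

Two records in the same store can carry entirely different fields, and a row with mostly NULL columns costs only as much as the fields it actually has.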

Distributed
• All about the filesystem here

Sharding Made Easy
• Fine-grained metadata allows for an efficient storage hierarchy

SortaSQL: Summary

• Bigtable-like structure
• Accessed via SQL (PHP bindings come for free!)
• Offload the hard part to the Filesystem Folks

Case Study: CloudFlare
• 400 GB data/day (Medium Data?)
• Facebook = 25 TB data/day
• USPS = 25.6 GB text data delivered/day
• Mix of Flash and Magnetic storage
• Mirrored
• Fixed user queries
• Random BizDev queries

Metadata

Partitioned by owner, period and data type
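One way to picture this partitioning is as a function from (owner, period, data type) to a data file. The layout below is purely illustrative: the directory scheme, the monthly period, the .kct suffix and the partition_path name are all assumptions, not the scheme CloudFlare used.

```python
from datetime import date
from pathlib import PurePosixPath


def partition_path(root, owner_id, day, data_type):
    """Map a partition key to the file that would hold its records."""
    period = day.strftime("%Y-%m")  # assumption: one partition per month
    return PurePosixPath(root) / str(owner_id) / period / f"{data_type}.kct"
```

For example, partition_path("/data", 42, date(2011, 6, 15), "hits") yields /data/42/2011-06/hits.kct, so a query scoped to one owner and period only has to open one file.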

Disk Layout

• 2 × 80 GB SSDs (small and blazing)
• 2.5 TB RAID5 (big and slow)
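With two tiers like these, something has to decide where each partition lives. The slides do not say how that choice is made, so the sketch below assumes a simple age-based policy; the mount points, the HOT_DAYS cutoff and the tier_root function are hypothetical.

```python
from datetime import date

HOT_ROOT = "/ssd"    # hypothetical mount point for the SSDs
COLD_ROOT = "/raid"  # hypothetical mount point for the RAID5 array
HOT_DAYS = 7         # hypothetical cutoff: newer partitions stay on SSD


def tier_root(period_start, today):
    """Pick a storage tier for a partition based on its age in days."""
    age_days = (today - period_start).days
    return HOT_ROOT if age_days <= HOT_DAYS else COLD_ROOT
```

Because the metadata records each file's path, moving a partition from the hot tier to the cold tier only requires copying the file and updating one metadata row.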

First Steps
(access some records)

Window Functions are
Your Friends

NoSQL to MySQL with
Memcached
• Replace Language not
Storage Engine
• Speak Memcached not
SQL

In Context
• https://github.com/cloudflare/SortaSQL
• http://dev.mysql.com/tech-resources/articles/nosql-to-mysql-with-memcached.html
• http://queue.acm.org/detail.cfm?id=1961297
