2. vFabric: What’s in it?
[Diagram: the vFabric stack, top to bottom]
§ Frameworks & Tools: Rich Web, Integration Patterns, Batch Framework, Spring Tool Suite, Social and Mobile, Data Access
§ vFabric Application Services: Application Srv (tc Server), Web Runtime (ERS), Messaging (RabbitMQ), Elastic Data Grid (Gemfire / SQLFire), DBaaS (vPostgres, Data Director), Perf, Mgmt (Hyperic / Insight, vCops/APM), EM4J
§ vSphere 5
7. Traditional DB Characteristics
§ Designed around constraints that are no longer relevant
• Network unreliable/slow
• RAM prices prohibitive
§ One size fits all
• Designed for everything, optimized for nothing
• Often incompatible with modern workloads
§ Centralized in nature
• Data change capture an afterthought
• Lacks data partitioning facilities
§ Obsessed with ACID
• Constant contention for resources causes locking
§ Monolithic design
§ Requires lots of hardware to scale
8. Traditional DB Loves IO
§ Buffers primarily tuned for IO
§ First write to LOG
§ Second write to Data files
9. Transaction in Traditional DB
Source: Research by MIT and Brown: “OLTP Under
the Looking Glass” by S. Harizopoulos, D. J. Abadi,
S. Madden, M. Stonebraker, SIGMOD 2008.
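[Pie chart: percentage of computer cycles, based on 3.5M sample. Legend: Data, Btrees keys, Logging, Locking, Latching, Buffer management. Slice values as extracted, with the slice-to-label mapping lost: 12%, 30%, 8%, 21%, 10%, 19%.]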
13. New Approach
Elastic, in-memory database designed specifically for speed and low latency, accessible through a familiar SQL interface.
14. SQLFire Characteristics
§ Highly concurrent data structures resident in and optimized for main memory
§ Rethink ACID transactions; all state resides in distributed memory to avoid any single points of contention
§ Partition-aware DB design spreads workloads across both data set and physical nodes
§ Shared nothing logs on disk; application writes are never exposed to the disk seek latencies
§ Parallelize data access and application behavior; dynamically “shard SQL”
§ Dynamic rebalancing of data as cluster size grows/shrinks. Most efficient way of managing resources/data.
21. Why Scale Horizontally?
Sub-divide system into independent data sets, eliminate distributed transactions to achieve elasticity, linear scalability and predictable latency.
22. Horizontal Scalability – Throughput
[Chart: queries per second (left axis, 0 to 800,000) and client threads (right axis, 0 to 1,400) plotted against number of servers (2 to 10).]
23. Horizontal Scalability – Consistency/HA
§ Resiliency through replication, synchronous but in parallel
§ Row updates are always atomic; no need for transactions
§ Shared nothing architecture, including storage
§ Instant failover at protocol level
§ Apps retain their connections
§ Data remains available
[Diagram: one APP connected to three SQLFire servers.]
25. Data strategies – Partitioning
§ Balances data across SQLFire cluster
§ Delivers redundancy for high availability
[Diagram: one APP and three SQLFire servers. Legend: write operation (with 2 redundant copies), read operation.]
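To make the redundancy idea concrete, here is a minimal sketch of placing each data bucket on a primary server plus redundant copies on other servers. The round-robin placement, the `place_copies` function, and the server names are assumptions made for illustration; the slides do not describe how SQLFire actually chooses hosts.

```python
from typing import Dict, List


def place_copies(bucket_id: int, nodes: List[str], redundant_copies: int) -> List[str]:
    """Return the nodes holding one bucket: the primary first, then its redundant copies."""
    if redundant_copies >= len(nodes):
        raise ValueError("need more nodes than redundant copies")
    start = bucket_id % len(nodes)
    # Walk the node list from the primary so every copy lands on a distinct node.
    return [nodes[(start + i) % len(nodes)] for i in range(redundant_copies + 1)]


nodes = ["server-1", "server-2", "server-3"]
placement: Dict[int, List[str]] = {
    bucket: place_copies(bucket, nodes, redundant_copies=2) for bucket in range(6)
}
```

With three servers and 2 redundant copies, every bucket ends up on all three servers, while the primaries are spread evenly so that load is still balanced.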
26. SQLFire Hash Partitioning
§ Partition by column or primary key
• Can specify multiple columns
• Uses hashCode() for single column or primary key
• Uses serialized bytes for multiple columns
• Creates uniform distribution of data across the cluster
-- Partition by column
CREATE TABLE MY_TABLE
( . . . ) PARTITION BY COLUMN ( COLUMN_A );
-- Partition by primary key
CREATE TABLE MY_TABLE
( . . . ) PARTITION BY PRIMARY KEY;
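The sketch below illustrates the general idea behind hash partitioning: hash the partitioning value, reduce it to a bucket, and map the bucket to a server. It is not SQLFire code. `zlib.crc32` stands in for the `hashCode()` mentioned above, the multi-column case serializes the values to bytes before hashing, and `NUM_BUCKETS`, `bucket_for`, `server_for`, and the server names are assumptions chosen for the example.

```python
import zlib
from collections import Counter
from typing import List

NUM_BUCKETS = 113  # illustrative bucket count, an assumption


def bucket_for(*column_values) -> int:
    """Hash one or more partitioning column values to a bucket id."""
    # Serialize the column values to bytes, then hash the bytes.
    payload = "\x1f".join(repr(value) for value in column_values).encode("utf-8")
    return zlib.crc32(payload) % NUM_BUCKETS


def server_for(bucket: int, servers: List[str]) -> str:
    """Map a bucket to a server (simple modulo placement for illustration)."""
    return servers[bucket % len(servers)]


servers = ["server-1", "server-2", "server-3"]
rows_per_server = Counter(
    server_for(bucket_for(key), servers) for key in range(30000)
)
```

Because the same key always hashes to the same bucket, any node can compute where a row lives without consulting a central directory.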
27. SQLFire Range Partitioning
§ Partition by range of column values
• Can specify multiple ranges
• Colocates data in specified ranges
• Used to ensure locality of data in a partition for range queries or cross table joins
-- Partition by range
CREATE TABLE MY_TABLE
( . . . ) PARTITION BY RANGE ( COLUMN_A )
(
VALUES BETWEEN 1 AND 10,
VALUES BETWEEN 50 AND 60
);
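A minimal sketch of how a value could be routed to one of the ranges declared above. It assumes each range includes its lower bound and excludes its upper bound, and it returns `None` for values outside every range; both choices are assumptions, since the slide does not say how bounds or uncovered values are treated. `RANGES` and `range_partition` are hypothetical names.

```python
from bisect import bisect_right
from typing import Optional

# The two ranges from the DDL above, sorted and non-overlapping.
RANGES = [(1, 10), (50, 60)]
LOWER_BOUNDS = [low for low, _ in RANGES]


def range_partition(value: int) -> Optional[int]:
    """Return the index of the range containing `value`, or None if no range does."""
    # Find the last range whose lower bound is <= value.
    idx = bisect_right(LOWER_BOUNDS, value) - 1
    if idx >= 0 and RANGES[idx][0] <= value < RANGES[idx][1]:
        return idx
    return None


# Values in the same range share a partition, so a range query touches one place.
same_partition = range_partition(3) == range_partition(7)
```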
28. SQLFire List Partitioning
§ Partition by a set of column values
• Can specify column value sets
• Colocates data with specified column values
• Used to ensure locality of data in a partition for sets of values or cross table joins
-- Partition by list
CREATE TABLE MY_TABLE
( . . . ) PARTITION BY LIST ( COLUMN_A )
(
VALUES ('VALUE_A', 'VALUE_B'),
VALUES ('VALUE_Y', 'VALUE_Z')
);
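List partitioning can be pictured as a lookup table from column value to partition. The sketch below builds that table from the two value sets in the DDL above; `VALUE_LISTS`, `PARTITION_OF`, and `list_partition` are hypothetical names, and returning `None` for unlisted values is an assumption, as the slide does not cover that case.

```python
from typing import Optional

# The two value sets from the DDL above, one partition per set.
VALUE_LISTS = [("VALUE_A", "VALUE_B"), ("VALUE_Y", "VALUE_Z")]

# Invert the sets into a value -> partition index lookup.
PARTITION_OF = {
    value: index for index, values in enumerate(VALUE_LISTS) for value in values
}


def list_partition(value: str) -> Optional[int]:
    """Return the partition index for a listed value, or None if it is not listed."""
    return PARTITION_OF.get(value)
```

Rows with `VALUE_A` and `VALUE_B` resolve to the same partition, which is what keeps them colocated.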
29. SQLFire Expression Partitioning
§ Partition by a column expression
• Expression must be valid SQL function
• Must reference only columns in the table
• Hash partition with value determined by the expression
-- Partition by expression
CREATE TABLE MY_TABLE
( . . . ) PARTITION BY ( MONTH ( MY_DATE ) );
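A small sketch of expression partitioning: evaluate the expression on the row first, then hash the result to a bucket. Python's built-in `hash` on an integer stands in for the real hash function, and `NUM_BUCKETS` and `expression_bucket` are assumptions for illustration.

```python
from datetime import date

NUM_BUCKETS = 113  # illustrative bucket count, an assumption


def expression_bucket(my_date: date) -> int:
    """Evaluate MONTH(MY_DATE), then hash the result to a bucket."""
    month = my_date.month  # plays the role of the SQL MONTH() function
    return hash(month) % NUM_BUCKETS


# Rows from the same calendar month land in the same bucket, whatever the year.
march_2011 = expression_bucket(date(2011, 3, 1))
march_2012 = expression_bucket(date(2012, 3, 31))
```

One consequence worth keeping in mind: an expression with few distinct results, such as a month number, can occupy at most that many buckets.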
30. SQLFire Default Partitioning
§ Default hash partitioning strategy
• Start server with table-default-partitioned property set to true
• Partitioning key is then the first of the following that applies:
• First foreign key whose referenced primary key is also a partition column
• Primary key
• First unique key
• SQLFire-generated row id
-- No PARTITION BY clauses
CREATE TABLE MY_TABLE
(COLUMN_A INT NOT NULL CONSTRAINT A_PK PRIMARY KEY, . . .);
CREATE TABLE MY_OTHER_TABLE
(COLUMN_B INT NOT NULL CONSTRAINT B_PK PRIMARY KEY,
COLUMN_C INT CONSTRAINT A_FK REFERENCES MY_TABLE (COLUMN_A), . . .);
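The selection order listed on this slide can be sketched as a simple fall-through. The `Table` and `ForeignKey` classes and the `choose_default_partitioning` function are hypothetical stand-ins written for this note, not SQLFire's catalog types, and the sketch assumes table-default-partitioned is already set to true.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ForeignKey:
    columns: List[str]
    ref_table: str
    ref_columns: List[str]


@dataclass
class Table:
    name: str
    primary_key: List[str] = field(default_factory=list)
    unique_keys: List[List[str]] = field(default_factory=list)
    foreign_keys: List[ForeignKey] = field(default_factory=list)
    partition_columns: List[str] = field(default_factory=list)


def choose_default_partitioning(table: Table, catalog: Dict[str, Table]) -> List[str]:
    """Pick partitioning columns in the order given on the slide."""
    for fk in table.foreign_keys:
        parent = catalog.get(fk.ref_table)
        # The referenced primary key must also be the parent's partition column(s).
        if parent and fk.ref_columns == parent.primary_key == parent.partition_columns:
            return fk.columns
    if table.primary_key:
        return table.primary_key
    if table.unique_keys:
        return table.unique_keys[0]
    return ["<generated row id>"]  # placeholder for the SQLFire-generated row id


# The two tables from the DDL above.
my_table = Table("MY_TABLE", primary_key=["COLUMN_A"])
my_table.partition_columns = choose_default_partitioning(my_table, {})
my_other_table = Table(
    "MY_OTHER_TABLE",
    primary_key=["COLUMN_B"],
    foreign_keys=[ForeignKey(["COLUMN_C"], "MY_TABLE", ["COLUMN_A"])],
)
chosen = choose_default_partitioning(my_other_table, {"MY_TABLE": my_table})
```

Under these assumptions MY_TABLE is partitioned on its primary key and MY_OTHER_TABLE on the foreign key column COLUMN_C, so related rows of the two tables hash on matching values.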
31. Data strategies – Replication
§ Copies all data across SQLFire cluster
§ Appropriate for reference data
[Diagram: one APP and three SQLFire servers. Legend: write operation (with replicated copies), read operation.]
32. SQLFire Replicated Tables
§ Created by default with no PARTITION BY clause
§ Created with REPLICATE clause
§ Reference data or fact tables are good candidates
§ Replicates data across all peers in server group
§ Replication is parallel and synchronous
§ Automatic replication failure detection
-- Replication example
CREATE TABLE MY_TABLE
( . . . )
REPLICATE;
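"Parallel and synchronous" replication can be sketched as follows: a write is sent to every replica at the same time, and the caller is released only after all of them have acknowledged. This is a toy in-process model using threads; the `Replica` and `ReplicatedTable` classes are hypothetical, and failure detection is left out.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional


class Replica:
    """One server's full copy of the table (in-memory stand-in)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: Dict[int, str] = {}

    def apply(self, key: int, row: str) -> str:
        self.rows[key] = row
        return self.name  # acknowledgement


class ReplicatedTable:
    def __init__(self, replicas: List[Replica]) -> None:
        self.replicas = replicas

    def put(self, key: int, row: str) -> List[str]:
        """Send the write to all replicas in parallel; return once every one has acked."""
        with ThreadPoolExecutor(max_workers=len(self.replicas)) as pool:
            futures = [pool.submit(r.apply, key, row) for r in self.replicas]
            return [f.result() for f in futures]  # blocks until all replicas ack

    def get(self, key: int, replica_index: int = 0) -> Optional[str]:
        """Any replica can serve a read, since each holds all the data."""
        return self.replicas[replica_index].rows.get(key)


table = ReplicatedTable([Replica(f"server-{i}") for i in range(1, 4)])
acks = table.put(1, "reference row")
```

Because every replica holds the full table, a read can be served locally by whichever server the application is connected to, which is why small reference data suits this strategy.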
37. Synchronous strategy
In data-center or over private network
[Diagram: two sites. Site 1 and Site 2 each run two APP JVMs, one SQLFire Locator, and three SQLFire server JVMs. The Site 1 servers form Redundancy Zone A and the Site 2 servers form Redundancy Zone B.]
38. Asynchronous strategy
Multi-site over the Cloud
[Diagram: two sites. Site 1 and Site 2 each run two APP JVMs, one SQLFire Locator, and three SQLFire server JVMs. A WAN Gateway links the two sites.]
39. Data strategies – Server Groups
[Diagram: a SQLFire cluster of six SQLFire server JVMs in two rows of three, with server groups labeled Group 1, Group 2, and Group 3.]