Polyglot Persistence

Scott Leberknight

http://memeagora.blogspot.com/2006/12/polyglot-programming.html
Neal Ford
December 2006
Polyglot Programming

http://www.amazon.com/Paradox-Choice-Why-More-Less/dp/0060005688

http://java-source.net/open-source/web-frameworks

InitialContext ic = new InitialContext();
DataSource ds = ic.lookup("java:comp/env/jdbc/cof
Connection con = null;
Statement stmt = null;
ResultSet rs = null;
try {
    con = ds.getConnection();
    stmt = con.createStatement();
    rs = stmt.executeQuery("select name, price from
    List<Coffee> coffees = new ArrayList<Coffee>();
    while (rs.next()) {
        String name = rs.getString("name");
        float price = rs.getFloat("price");
        coffees.add(new Coffee(name, price));
    }
} catch (SQLException sqlex) {
...and now
PERSISTENCE

Why?
Scalability (on massive scales)
High availability
Fault tolerance
Distributability
Flexibility (i.e. "schemaless")
New types of apps, e.g. social networking

Why?
One size does not fit all

A few types of databases...
Relational
Document-Oriented
Object
Bigtable-ish
Key-value
EAV (Entity-Attribute-Value)

Types of data
Structured
Semi-Structured
Unstructured

ACID
Atomic
Consistent
Isolated
Durable

ACID in Action
Transfer $1000 from 1st Bank checking to savings
[Diagram: 1st Bank database with tables checking, savings, customers]

BASE
Basically Available
Soft State
Eventually Consistent

BASE in Action
Transfer $1000 from 1st Bank checking to Bank of Foo savings
[Diagram: 1st Bank database with tables checking, savings, customers; Bank of Foo database with tables account, account_type, customer]

Schedule, Cost, Quality
(choose any 2)

"When designing distributed web services, there
are three properties that are commonly desired:
consistency, availability, and partition tolerance.
It is impossible to achieve all three."
- "Brewer's Conjecture and the Feasibility of Consistent,
Available, Partition-Tolerant Web Services"
Seth Gilbert and Nancy Lynch (MIT)

Consistency
Partition-tolerance
Availability
(choose any 2)

We're living in interesting times...
Explosion of alternative persistence choices
Completely new philosophies on persistence

Whirlwind tour...
Relational
Document-Oriented
Key/Value
Bigtable

Relational
Databases
blog blog_entry blog_entry_comment
category
daily_statistics
blog_owner
blog_user

Relations (tables, joins, integrity)
ACID guarantees
Query using SQL
Strict schema
Difficult to scale, partition (e.g. 2-phase commit)
Mismatch with OO languages
By far most popular persistence choice today

select *
  from fakenames f
 where f.surname like 'Smi%'
   and f.city = 'Richmond'
   and f.state = 'VA'
 order by f.surname, f.given_name;

Scaling...
Buy a bigger machine
(vertical scaling)

What if there is no bigger machine?
Horizontal scaling:
Functional
Sharding

Functional partitions: Users, Products, Orders
Shards: Users 0, Users 1 | Products 0 | Orders 0, Orders 1, Orders 2
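
A minimal Java sketch of the two horizontal-scaling ideas on this slide: functional partitioning picks a group of databases by entity type, and sharding picks one database inside that group by key. The class name `ShardRouter`, the shard counts, and the modulo-hash rule are assumptions made for illustration and do not come from any particular product.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical router; names and shard counts are illustrative only.
class ShardRouter {
    // Functional partitioning: each entity type has its own group of shards.
    private final Map<String, Integer> shardCounts = new LinkedHashMap<>();

    ShardRouter() {
        shardCounts.put("Users", 2);     // Users 0, Users 1
        shardCounts.put("Products", 1);  // Products 0
        shardCounts.put("Orders", 3);    // Orders 0, Orders 1, Orders 2
    }

    // Sharding: hash the key to choose one shard inside the functional group.
    String locate(String entityType, String key) {
        Integer count = shardCounts.get(entityType);
        if (count == null) {
            throw new IllegalArgumentException("Unknown entity type: " + entityType);
        }
        int shard = Math.floorMod(key.hashCode(), count);
        return entityType + " " + shard;
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter();
        System.out.println(router.locate("Users", "alice"));
        System.out.println(router.locate("Orders", "order-1001"));
        System.out.println(router.locate("Products", "sku-42"));
    }
}
```

One known weakness of plain modulo routing is that changing the number of shards remaps most keys, which is part of why resharding a relational system tends to be painful.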

"As opposed to Relational Databases, document-based
databases do not store data in tables with uniform sized
fields for each record. Instead, each record is stored as a
document that has certain characteristics. Any number of
fields of any length can be added to a document. Fields can
also contain multiple pieces of data."
- Wikipedia
(http://en.wikipedia.org/wiki/Document-oriented_database)

Examples:
Lotus Notes
Apache CouchDB
Amazon SimpleDB (for our purposes anyway)
MongoDB

Concepts:
Documents
Views
Schemaless
Distributed

Views
JavaScript as description language
Map/Reduce functions
Add structure to semi-structured data
Independent of actual documents
(created in special Design Documents)

Simplest map function...
function(doc) {
  emit(null, doc);
}

// Map function to find Seattleites
function(doc) {
  if (doc.State == "WA" && doc.City == "Seattle") {
    emit(doc.Number,
         { "GivenName": doc.GivenName, "Surname": doc.Surname });
  }
}

Counting people by state...
// Map function
function(doc) {
  emit(doc.State, 1);
}
// Reduce function; aggregates counts
function(key, values) {
  return sum(values);
}
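
To make the map and reduce steps concrete, here is a small in-memory emulation in Java. It is only a sketch of the idea, not CouchDB code: the class `StateCountView`, the `doc` helper, and the sample documents are invented for illustration, and a real view engine also maintains and incrementally updates an index, which this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative in-memory emulation of a map/reduce view; not CouchDB code.
class StateCountView {
    // Helper that builds a made-up document with two fields.
    static Map<String, String> doc(String state, String city) {
        Map<String, String> d = new HashMap<>();
        d.put("State", state);
        d.put("City", city);
        return d;
    }

    // Map step: emit (State, 1) for each document, grouped by key.
    static Map<String, List<Integer>> map(List<Map<String, String>> docs) {
        Map<String, List<Integer>> emitted = new TreeMap<>();
        for (Map<String, String> d : docs) {
            emitted.computeIfAbsent(d.get("State"), k -> new ArrayList<>()).add(1);
        }
        return emitted;
    }

    // Reduce step: sum the values emitted for each key.
    static Map<String, Integer> reduce(Map<String, List<Integer>> emitted) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : emitted.entrySet()) {
            int sum = 0;
            for (int v : e.getValue()) {
                sum += v;
            }
            counts.put(e.getKey(), sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = Arrays.asList(
            doc("WA", "Seattle"), doc("VA", "Richmond"), doc("WA", "Tacoma"));
        System.out.println(reduce(map(docs))); // prints {VA=1, WA=2}
    }
}
```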

Caution: Views are not meant to be created
dynamically like SQL queries!
"To keep view querying fast, the view engine maintains
indexes of its views, and incrementally updates them to
reflect changes in the database. CouchDB’s core
design is largely optimized around the need for
efficient, incremental creation of views and
their indexes."
- http://couchdb.apache.org/docs/overview.html

"Amazon SimpleDB is a web service for running queries on
structured data in real time. This service works in close
conjunction with Amazon Simple Storage Service (Amazon S3)
and Amazon Elastic Compute Cloud (Amazon EC2), collectively
providing the ability to store, process and query data sets in
the cloud. These services are designed to make web-scale
computing easier and more cost-effective for developers."
- SimpleDB Developer Guide
(Version 2007-11-07)

"A traditional, clustered relational database requires a sizable
upfront capital outlay, is complex to design, and often requires a
DBA to maintain and administer. Amazon SimpleDB is
dramatically simpler, requiring no schema, automatically
indexing your data and providing a simple API for storage
and access. This approach eliminates the administrative
burden of data modeling, index maintenance, and performance
tuning. Developers gain access to this functionality within
Amazon’s proven computing environment, are able to scale
instantly, and pay only for what they use."
(Version 2007-11-07)

Organize data into domains
Domains have items
Items have attributes
Attributes have value(s)
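
The domain, item, attribute, value hierarchy can be pictured as nested maps. The Java sketch below is an in-memory model only; it is not the SimpleDB client API, and the class `DomainModel`, its methods, and the sample values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Illustrative model of domain -> item -> attribute -> value(s).
// Not the SimpleDB client API; class and method names are made up.
class DomainModel {
    private final String name;
    private final Map<String, Map<String, Set<String>>> items = new LinkedHashMap<>();

    DomainModel(String name) {
        this.name = name;
    }

    // Add a value; an attribute may accumulate several values.
    void put(String itemName, String attribute, String value) {
        items.computeIfAbsent(itemName, k -> new LinkedHashMap<>())
             .computeIfAbsent(attribute, k -> new LinkedHashSet<>())
             .add(value);
    }

    // Return all values of an attribute, or an empty set if there are none.
    Set<String> get(String itemName, String attribute) {
        Map<String, Set<String>> attrs = items.get(itemName);
        if (attrs == null || !attrs.containsKey(attribute)) {
            return new LinkedHashSet<>();
        }
        return attrs.get(attribute);
    }

    @Override
    public String toString() {
        return name + items;
    }

    public static void main(String[] args) {
        DomainModel domain = new DomainModel("Fakenames");
        domain.put("1", "GivenName", "Bob");
        domain.put("1", "EmailAddress", "bob@example.com");
        domain.put("1", "EmailAddress", "bsmith@example.org"); // second value, same attribute
        domain.put("2", "Surname", "Jones"); // items need not share attributes
        System.out.println(domain);
    }
}
```

Because nothing forces two items to carry the same attributes, the "schema" is whatever each item happens to contain.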

Domain: Fakenames
[Diagram: a domain of items "1" through "5". Each item has the attributes ID, GivenName, Surname, Birthday, and EmailAddress, and an attribute such as EmailAddress can hold more than one value.]

Domain: Amazon
[Diagram: a domain of items "Item01" through "Item05" spanning the categories "Clothes", "Entertainment", and "Books". Attributes shown are ID, Category, Subcategory, Name, Author, Color, Size, Length, and Format. Items carry different sets of attributes, and attributes such as Color and Size hold several values.]

"REST" API
POST / HTTP/1.1
Content-Type: application/x-www-form-urlencoded; charset=utf-8
User-Agent: Amazon Simple DB Java Library
Host: sdb.amazonaws.com
Content-Length: 232
Action=CreateDomain&
DomainName=Fakenames&
AWSAccessKeyId=[your AWS access key id]&
SignatureVersion=2&
SignatureMethod=HmacSHA256&
Signature=[computed signature]&
Timestamp=2009-03-23T23%3A58%3A55.327Z&
Version=2007-11-07

Available APIs:
Amazon: Java, C#, Perl, PHP, VB
3rd party:
Ruby gems: aws-simpledb, aws-sdb, simpledb
Python: polarrose-twisted-amazon

AmazonSimpleDB service =
    new AmazonSimpleDBClient(accessKeyId, secretAccessKey);
// Create a new domain
CreateDomainRequest cdReq =
    new CreateDomainRequest().withDomainName("Fakenames");
CreateDomainResponse cdResp = service.createDomain(cdReq);
// List all our domains
ListDomainsRequest ldReq = new ListDomainsRequest();
ListDomainsResponse ldResp = service.listDomains(ldReq);

Sample response:
<ListDomainsResponse
    xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
  <ListDomainsResult>
    <DomainName>Fakenames</DomainName>
    <DomainName>Movies</DomainName>
  </ListDomainsResult>
  <ResponseMetadata>
    <RequestId>8c4d0240-49ea-5d2f-9573-437324cd144c</RequestId>
    <BoxUsage>0.0000071759</BoxUsage>
  </ResponseMetadata>
</ListDomainsResponse>

// Add an attribute value
ReplaceableAttribute newEmail =
    new ReplaceableAttribute("emailAddress",
                             "bortiz@spammail.com",
                             false);
PutAttributesRequest request =
    new PutAttributesRequest()
        .withDomainName("Fakenames")
        .withItemName("1")
        .withAttribute(newEmail);
PutAttributesResponse response = service.putAttributes(request);

// Query for Richmonders
String query =
    "['city' = 'Richmond'] intersection ['state' = 'VA']";
QueryRequest request = new QueryRequest()
    .withQueryExpression(query);
QueryResponse response = service.query(request);

// Query for Richmonders, with attributes
String query =
    "['city' = 'Richmond'] intersection ['state' = 'VA']";
QueryWithAttributesRequest request =
    new QueryWithAttributesRequest()
        .withQueryExpression(query);
QueryWithAttributesResponse response =
    service.queryWithAttributes(request);

// Get a count
String query = "select count(*) from Fakenames";
SelectRequest request =
    new SelectRequest().withSelectExpression(query);
SelectResponse response = service.select(request);

// Select Richmonders
String query = "select * from Fakenames"
    + " where city = 'Richmond' intersection state = 'VA'"
    + " intersection surname like 'Smi%'";
SelectRequest request =
    new SelectRequest().withSelectExpression(query);
SelectResponse response = service.select(request);

There are Limits!
Query execution time <= 5 sec
Max items in query response = 250
Item name, attribute name, and attribute value size <= 1024 bytes each
Attribute name-value pairs per item <= 256
See SimpleDB Developer Guide for more...
NextToken (May I have another?)
<QueryResponse
    xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
  <QueryResult>
    <ItemName>131</ItemName>
    ...
    <NextToken>
rO0ABXNyACdjb20uYW1hem9uLnNkcy5RdWVyeVByb2Nlc3Nvci5Nb3JlVG9r
racXLnINNqwMACkkAFGluaXRpYWxDb25qdW5jdEluZGV4WgAOaXNQYWdlQm91bmRhc
...
    </NextToken>
  </QueryResult>
  <ResponseMetadata>
    ...
  </ResponseMetadata>
</QueryResponse>
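
Because a response is capped, a client has to keep resubmitting the query with the returned token until no token comes back. The Java sketch below shows that loop against a fake in-memory service. `Page`, `PagedQuery`, and `NextTokenLoop` are hypothetical types invented for this example and are not classes from the SimpleDB library; the fake uses a numeric offset as its token, whereas a real token is opaque and should simply be passed back unchanged.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration; not classes from the SimpleDB library.
class Page {
    final List<String> itemNames;
    final String nextToken; // null when there are no more pages

    Page(List<String> itemNames, String nextToken) {
        this.itemNames = itemNames;
        this.nextToken = nextToken;
    }
}

interface PagedQuery {
    Page fetch(String nextToken); // pass null for the first page
}

class NextTokenLoop {
    // Keep asking for pages until the service stops returning a token.
    static List<String> fetchAll(PagedQuery query) {
        List<String> all = new ArrayList<>();
        String token = null;
        do {
            Page page = query.fetch(token);
            all.addAll(page.itemNames);
            token = page.nextToken;
        } while (token != null);
        return all;
    }

    // Fake in-memory service that returns at most pageSize items per call.
    static PagedQuery fake(List<String> data, int pageSize) {
        return token -> {
            int start = (token == null) ? 0 : Integer.parseInt(token);
            int end = Math.min(start + pageSize, data.size());
            String next = (end < data.size()) ? String.valueOf(end) : null;
            return new Page(new ArrayList<>(data.subList(start, end)), next);
        };
    }

    public static void main(String[] args) {
        List<String> data = new ArrayList<>();
        for (int i = 1; i <= 7; i++) {
            data.add("item" + i);
        }
        System.out.println(fetchAll(fake(data, 3)).size()); // prints 7
    }
}
```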

Eventually consistent(*)
"Amazon SimpleDB keeps multiple copies of each
domain. When data is written or updated...all copies of
the data are updated. However, it takes time for the
data to propagate to all storage locations. The data will
eventually be consistent, but an immediate read
might not show the change. Consistency is usually
reached within seconds, but a high system load or
network partition might increase this time. Performing
a read after a short period of time should return the
updated data."
(Version 2007-11-07)

(*) ConsistentRead
Version 2009-04-15 added consistent read option
"If eventually consistent reads are not
acceptable for your application, use
ConsistentRead. Although this operation
might take longer than a standard read, it
always returns the last updated value."
(Version 2009-04-15)

Distributed Key-Value Stores

Basically...
value = store.get(key)
store.put(key, value)
store.remove(key)

Data stored as key/value pairs
"A big hashtable"
Replication
Fault tolerance
Data consistency & versioning
Horizontal scaling

Amazon Dynamo
(a real-world example)

Distributed key-value storage system
Used by Amazon core and web services
(e.g. your Amazon shopping cart...)
Massively scalable
Fault tolerant
Eventually consistent

The-Project-Which-Must-Not-Be-Named
(Project Voldemort)

What is it?
"a distributed key-value storage system"
automatic replication across multiple servers
transparent server failure handling
automatic data item versioning

"Voldemort is not a relational database, it does not
attempt to satisfy arbitrary relations while satisfying ACID
properties. Nor is it an object database that attempts to
transparently map object reference graphs. Nor does it
introduce a new abstraction such as document-
orientation. It is basically just a big, distributed,
persistent, fault-tolerant hash table."
http://project-voldemort.com/

designed for horizontal scaling
used at LinkedIn "for certain high-scalability
storage problems where simple functional
partitioning is not sufficient"

"Consistent hashing"
No single server holds all data
Data partitioned across multiple servers
Versioning using "vector clocks"
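
Consistent hashing places both nodes and keys on a ring of hash values, and a key belongs to the first node point at or after its own position. The Java sketch below illustrates the general technique only and is not Voldemort's routing code; the class `HashRing`, the use of MD5, and the number of points per node are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Illustrative hash ring; not Voldemort's actual routing code.
class HashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int pointsPerNode;

    HashRing(int pointsPerNode) {
        this.pointsPerNode = pointsPerNode;
    }

    // Each node owns several points so load spreads more evenly.
    void addNode(String node) {
        for (int i = 0; i < pointsPerNode; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        for (int i = 0; i < pointsPerNode; i++) {
            ring.remove(hash(node + "#" + i));
        }
    }

    // Walk clockwise from the key's position to the first node point,
    // wrapping around to the start of the ring if necessary.
    String nodeFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no nodes in ring");
        }
        Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    // Use the first 8 bytes of an MD5 digest as the ring position.
    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xff);
            }
            return h;
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        HashRing ring = new HashRing(100);
        ring.addNode("node0");
        ring.addNode("node1");
        ring.addNode("node2");
        System.out.println("key 1 lives on " + ring.nodeFor("1"));
        ring.removeNode("node2");
        System.out.println("after removing node2: " + ring.nodeFor("1"));
    }
}
```

The useful property is that removing a node only moves the keys that node owned; every other key stays where it was.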

Configuration:
cluster.xml describes cluster
(servers, data partitions)
stores.xml describes data stores
(persistence, routing, key/value data format, replication factor,
preferred reads/writes, required reads/writes)

<cluster>
<name>mycluster</name>
<server>
<id>0</id>
<host>localhost</host>
<http-port>8081</http-port>
<socket-port>6666</socket-port>
<partitions>0, 1, 2, 3</partitions>
</server>
<server>
<id>1</id>
<host>localhost</host>
<http-port>8082</http-port>
<socket-port>6667</socket-port>
<partitions>4, 5, 6, 7</partitions>
</server>
</cluster>
sample cluster.xml

<stores>
<store>
<name>people</name>
<persistence>bdb</persistence>
<routing>client</routing>
<replication-factor>3</replication-factor>
<preferred-reads>3</preferred-reads>
<required-reads>2</required-reads>
<preferred-writes>2</preferred-writes>
<required-writes>1</required-writes>
<key-serializer>
<type>json</type>
<schema-info>"string"</schema-info>
</key-serializer>
<value-serializer>
<type>json</type>
<schema-info>{"GivenName":"string", "Surname":"string"}</schema-info>
</value-serializer>
</store>
</stores>
sample stores.xml
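
The replication factor and the required read and write counts interact through the usual Dynamo-style quorum rule: a read is guaranteed to touch at least one replica that saw the latest successful write only when required reads plus required writes exceed the replication factor. The Java sketch below checks that arithmetic for the sample values; `QuorumCheck` is a hypothetical helper, and how Voldemort itself applies these settings is not covered here.

```java
// Illustrative check of the Dynamo-style quorum rule; not part of Voldemort.
class QuorumCheck {
    // True when every read quorum must share a replica with every write quorum.
    static boolean readsOverlapWrites(int replicationFactor, int requiredReads, int requiredWrites) {
        if (requiredReads < 1 || requiredWrites < 1
                || requiredReads > replicationFactor || requiredWrites > replicationFactor) {
            throw new IllegalArgumentException("counts must be between 1 and the replication factor");
        }
        return requiredReads + requiredWrites > replicationFactor;
    }

    // Number of replicas that can be down while an operation still succeeds.
    static int tolerableFailures(int replicationFactor, int required) {
        return replicationFactor - required;
    }

    public static void main(String[] args) {
        // Values from the sample: replication-factor 3, required-reads 2, required-writes 1.
        System.out.println(readsOverlapWrites(3, 2, 1)); // prints false
        System.out.println(readsOverlapWrites(3, 2, 2)); // prints true
        System.out.println(tolerableFailures(3, 1));     // prints 2
    }
}
```

With the sample settings, 2 + 1 does not exceed 3, so by this rule a read could miss the newest write; that trades consistency for write availability.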

> locate "1"
Node 0
host: localhost
port: 6666
available: yes
last checked: 96171 ms ago
Node 1
host: localhost
port: 6667
available: yes
Node 2
host: localhost
port: 6668
available: yes
replication

$ ./voldemort-shell.sh people tcp://localhost:6666
Established connection to people via tcp://localhost:6666
> put "1" { "GivenName":"Bob", "Surname":"Smith" }
> get "1"
version(0:1): {"GivenName":"Bob", "Surname":"Smith", }
> put "1" { "GivenName":"Robert", "Surname":"Smith", }
> get "1"
version(0:2): {"GivenName":"Robert", "Surname":"Smith", }
vector clock
(master node: version)
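
A vector clock keeps one counter per node, which lets a store tell whether one version supersedes another or whether two versions were written concurrently and must be reconciled. The Java sketch below shows the comparison rule in general terms. `SimpleVectorClock` is a made-up class for illustration and is not Voldemort's own implementation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative vector clock; not Voldemort's own class.
class SimpleVectorClock {
    enum Order { BEFORE, AFTER, EQUAL, CONCURRENT }

    private final Map<Integer, Long> versions = new HashMap<>();

    // Return a new clock that records one more write coordinated by nodeId.
    SimpleVectorClock increment(int nodeId) {
        SimpleVectorClock next = new SimpleVectorClock();
        next.versions.putAll(versions);
        next.versions.merge(nodeId, 1L, Long::sum);
        return next;
    }

    // Compare counters node by node; a missing counter counts as zero.
    Order compare(SimpleVectorClock other) {
        boolean less = false;
        boolean greater = false;
        Set<Integer> nodes = new HashSet<>(versions.keySet());
        nodes.addAll(other.versions.keySet());
        for (Integer node : nodes) {
            long mine = versions.getOrDefault(node, 0L);
            long theirs = other.versions.getOrDefault(node, 0L);
            if (mine < theirs) less = true;
            if (mine > theirs) greater = true;
        }
        if (less && greater) return Order.CONCURRENT;
        if (less) return Order.BEFORE;
        if (greater) return Order.AFTER;
        return Order.EQUAL;
    }

    @Override
    public String toString() {
        return versions.toString();
    }

    public static void main(String[] args) {
        SimpleVectorClock v1 = new SimpleVectorClock().increment(0); // like version(0:1)
        SimpleVectorClock v2 = v1.increment(0);                      // like version(0:2)
        System.out.println(v1.compare(v2)); // prints BEFORE

        // Two writers start from v1 on different nodes: neither supersedes the other.
        SimpleVectorClock a = v1.increment(1);
        SimpleVectorClock b = v1.increment(2);
        System.out.println(a.compare(b)); // prints CONCURRENT
    }
}
```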

Java API example
StoreClientFactory factory =
    new SocketStoreClientFactory(numThreads,
        numThreads, maxQueuedRequests, maxConnectionsPerNode,
        maxTotalConnections, bootstrapUrl);
StoreClient<Integer, Map<String, Object>> client =
    factory.getStoreClient("fakenames");
// Update a value
Versioned<Map<String, Object>> versioned = client.get(1);
Map<String, Object> person = versioned.getValue();
person.put("EmailAddress", newEmailAddr);
versioned.setObject(person);
client.put(1, versioned);

"Bigtable is a distributed storage
system for managing structured data
that is designed to scale to a very
large size: petabytes of data across
thousands of commodity
servers. Many projects at Google
store data in Bigtable including web
indexing, Google Earth, and Google
Finance."
- Bigtable: A Distributed Storage System
for Structured Data
http://labs.google.com/papers/bigtable.html

"A Bigtable is a sparse, distributed, persistent
multidimensional sorted map"
- Bigtable: A Distributed Storage System for Structured Data

distributed
sparse
column-oriented
versioned

(row key, column key, timestamp) => value
"The map is indexed by a row key,
column key, and a timestamp; each
value in the map is an uninterpreted array
of bytes."
- Bigtable: A Distributed Storage System for Structured Data
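
The "sorted map" description can be modelled directly with a sorted in-memory map keyed by row, column, and timestamp. The Java sketch below only illustrates the data model; it is not Bigtable or HBase code, and `SparseSortedMap`, `CellKey`, and the sample cells are invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Illustrative model of (row key, column key, timestamp) => value.
// Not Bigtable or HBase code.
class SparseSortedMap {
    static final class CellKey {
        final String row;
        final String column;
        final long timestamp;

        CellKey(String row, String column, long timestamp) {
            this.row = row;
            this.column = column;
            this.timestamp = timestamp;
        }
    }

    // Sort by row, then column, then newest timestamp first.
    private static final Comparator<CellKey> ORDER = (a, b) -> {
        int c = a.row.compareTo(b.row);
        if (c != 0) return c;
        c = a.column.compareTo(b.column);
        if (c != 0) return c;
        return Long.compare(b.timestamp, a.timestamp);
    };

    private final TreeMap<CellKey, byte[]> cells = new TreeMap<>(ORDER);

    // Only cells that are written take up space, so the map is sparse.
    void put(String row, String column, long timestamp, String value) {
        cells.put(new CellKey(row, column, timestamp), value.getBytes(StandardCharsets.UTF_8));
    }

    // Newest version of a cell, or null if the cell was never written.
    String getLatest(String row, String column) {
        Map.Entry<CellKey, byte[]> e =
            cells.ceilingEntry(new CellKey(row, column, Long.MAX_VALUE));
        if (e == null || !e.getKey().row.equals(row) || !e.getKey().column.equals(column)) {
            return null;
        }
        return new String(e.getValue(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        SparseSortedMap table = new SparseSortedMap();
        table.put("20090407145045", "info:author", 2L, "John");
        table.put("20090407145045", "info:author", 6L, "John Doe");
        table.put("20090407145045", "info:title", 1L, "Intro to Bigtable");
        System.out.println(table.getLatest("20090407145045", "info:author")); // prints John Doe
    }
}
```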

Key Concepts:
row key => 20090407152657
column family => "name:"
column key => "name:first", "name:last"
timestamp => 1239124584398

Row Key         Timestamp  Column Family "info:"                 Column Family "content:"
20090407145045  t7         "info:summary"   "An intro to..."
20090407145045  t6         "info:author"    "John Doe"
20090407145045  t5                                               "Google's Bigtable is..."
20090407145045  t4                                               "Google Bigtable is..."
20090407145045  t3         "info:category"  "Persistence"
20090407145045  t2         "info:author"    "John"
20090407145045  t1         "info:title"     "Intro to Bigtable"
20090320162535  t4         "info:category"  "Persistence"
20090320162535  t3                                               "CouchDB is..."
20090320162535  t2         "info:author"    "Bob Smith"
20090320162535  t1         "info:title"     "Doc-oriented..."

Ask for row 20090407145045...

Apache HBase
(an open source Bigtable implementation)

HBase uses a data model very similar to that of Bigtable.
Applications store data rows in labeled tables. A data row
has a sortable row key and an arbitrary number of
columns. The table is stored sparsely, so that rows in
the same table can have widely varying numbers of
columns.
- http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture

HBase Shell
hbase(main):001:0> create 'blog', 'info', 'content'
0 row(s) in 4.3640 seconds
hbase(main):002:0> put 'blog', '20090320162535', 'info:title', 'Document-oriented storage using CouchDB'
hbase(main):003:0> put 'blog', '20090320162535', 'info:author', 'Bob Smith'
hbase(main):004:0> put 'blog', '20090320162535', 'content:', 'CouchDB is a document-oriented...'
hbase(main):005:0> put 'blog', '20090320162535', 'info:category', 'Persistence'
hbase(main):006:0> get 'blog', '20090320162535'
COLUMN          CELL
content:        timestamp=1239135042862, value=CouchDB is a doc...
info:author     timestamp=1239135042755, value=Bob Smith
info:category   timestamp=1239135042982, value=Persistence
info:title      timestamp=1239135042623, value=Document-oriented...

hbase(main):015:0> get 'blog', '20090407145045', { COLUMN => 'info:author', VERSIONS => 3 }
timestamp=1239135325074, value=John Doe
timestamp=1239135324741, value=John
hbase(main):016:0> scan 'blog', { STARTROW => '20090300', STOPROW => '20090400' }
ROW              COLUMN+CELL
20090320162535   column=content:, timestamp=1239135042862, value=CouchDB is...
20090320162535   column=info:author, timestamp=1239135042755, value=Bob Smith
20090320162535   column=info:category, timestamp=1239135042982, value=Persistence
20090320162535   column=info:title, timestamp=1239135042623, value=Document...

// Create a new table
HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
HTableDescriptor descriptor = new HTableDescriptor("mytable");
descriptor.addFamily(new HColumnDescriptor("family1:"));
admin.createTable(descriptor);

// Add some data into 'mytable'
HTable table = new HTable("mytable");
BatchUpdate update = new BatchUpdate("row1");
update.put("family1:aaa", Bytes.toBytes("some value"));
table.commit(update);
// Get data back
RowResult result = table.getRow("row1");
Cell cell = result.get("family1:aaa");
// Overwrite earlier value and add more data
BatchUpdate update2 = new BatchUpdate("row1");
update2.put("family1:aaa", Bytes.toBytes("some value"));
update2.put("family2:bbb", Bytes.toBytes("another value"));
table.commit(update2);

Finding data:
get (by row key)
scan (by row key ranges, filtering)
Secondary indexes allow scanning by different keys
(a bit more flexibility, requires more storage)

// Scan for people born during January 1960
HTable table = new HTable("fakenames");
byte[][] columns =
    Bytes.toByteArrays(new String[]{ "name:", "gender:" });
byte[] startRow = Bytes.toBytes("19600101");
byte[] endRow = Bytes.toBytes("19600201");
Scanner scanner = table.getScanner(columns, startRow, endRow);
for (RowResult result : scanner) {
    ...
}
scanner.close();

one size does not fit all
lots of alternatives
think about what you really need...
(not what's currently "hot")

What do you really need?
distributed deployment?
fault tolerance?
query richness?
schema evolution?
extreme scalability?
ability to enforce relationships?
ACID or BASE?
key/value storage?

Even more alternatives...
XML databases
Semantic Web / RDF / Triplestores
Graph databases
Tuplespaces

General
Polyglot Persistence
http://www.sleberknight.com/blog/sleberkn/entry/polyglot_persistence
Database Thaw
http://martinfowler.com/bliki/DatabaseThaw.html
Application Design in the context of the shifting storage spectrum
http://qconsf.com/sf2008/presentation/Application+Design+in+the+context+of+the+shifting+storage+spectrum
BASE: An Acid Alternative
http://queue.acm.org/detail.cfm?id=1394128
The Challenges of Latency
http://www.infoq.com/articles/pritchett-latency
One size fits all: A concept whose time has come and gone
http://www.databasecolumn.com/2007/09/one-size-fits-all.html
http://www.cs.brown.edu/~ugur/fits_all.pdf
The End of an Architectural Era (It's Time for a Complete Rewrite)
http://db.cs.yale.edu/vldb07hstore.pdf
Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.1495

General
Semi-Structured Data
http://www.dcs.bbk.ac.uk/~ptw/teaching/ssd/toc.html
Latency is Everywhere and it Costs You Sales - How to Crush it
http://highscalability.com/latency-everywhere-and-it-costs-you-sales-how-crush-it
QCon London 2009: Database projects to watch closely
http://gojko.net/2009/03/11/qcon-london-2009-database-projects-to-watch-closely
Memories, Guesses, and Apologies
http://blogs.msdn.com/pathelland/archive/2007/05/15/memories-guesses-and-apologies.aspx
Column-oriented databases
http://en.wikipedia.org/wiki/Column-oriented_DBMS
Entity-Attribute-Value model
http://en.wikipedia.org/wiki/Entity-Attribute-Value_model
Read Consistency: Dumb Databases, Smart Services
http://blog.labnotes.org/2007/09/20/read-consistency-dumb-databases-smart-services/
Neo4j graph database
http://neo4j.org/
NoSql web site - "Your Ultimate Guide to the Non-Relational Universe"
http://nosql-database.org/

Document-Oriented Databases
Document-Oriented Database
http://en.wikipedia.org/wiki/Document-oriented_database
Apache CouchDB
http://couchdb.apache.org/
Why CouchDB?
http://pmuellr.blogspot.com/2008/01/why-couchdb.html
Why CouchDB Sucks
http://www.eflorenzano.com/blog/post/why-couchdb-sucks/
Damien Katz CouchDB Interview
http://www.infoq.com/news/2008/11/CouchDB-Damien-Katz
CouchDB: Thinking beyond the RDBMS
http://blog.labnotes.org/2007/09/02/couchdb-thinking-beyond-the-rdbms/
CouchDB Implementation
http://horicky.blogspot.com/2008/10/couchdb-implementation.html
Dare Takes a Look at CouchDB
http://intertwingly.net/blog/2007/09/12/Dare-Takes-a-Look-at-CouchDB

Document-Oriented Databases
CouchDB - A Use Case
http://kore-nordmann.de/blog/couchdb_a_use_case.html
Amazon SimpleDB
http://aws.amazon.com/simpledb/
http://en.wikipedia.org/wiki/SimpleDB
thrudb - Document Oriented Database Services
http://code.google.com/p/thrudb/
thrudb - faster, cheaper than SimpleDB
http://www.igvita.com/2007/12/28/thrudb-faster-and-cheaper-than-simpledb/
QCon 2008 track on Document-Oriented Distributed Databases
http://qconsf.com/sf2008/tracks/show_track.jsp?trackOID=170

Distributed K-V Stores
Amazon's Dynamo
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
Anti-RDBMS: A list of distributed key-value stores
http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores/
http://www.reddit.com/r/programming/comments/7qv19/antirdbms_a_list_of_distributed_keyvalue_stores/
Is the Relational Database Doomed?
http://developers.slashdot.org/comments.pl?sid=1127539&cid=26849641
Project Voldemort
http://project-voldemort.com/
Project Voldemort design (also see excellent list of references from this page)
http://project-voldemort.com/design.php
Consistent Hashing
http://en.wikipedia.org/wiki/Consistent_hashing

Bigtable / HBase
Google Architecture
http://highscalability.com/google-architecture
Bigtable: A Distributed Storage System for Structured Data
http://en.wikipedia.org/wiki/BigTable
http://labs.google.com/papers/bigtable-osdi06.pdf
Apache HBase
http://hadoop.apache.org/hbase/
http://en.wikipedia.org/wiki/HBase
Apache Hadoop
http://hadoop.apache.org/
Understanding HBase and BigTable
http://jimbojw.com/wiki/index.php?title=Understanding_Hbase_and_BigTable
Matching Impedance: When to use HBase
http://blog.rapleaf.com/dev/?p=26
HBase Leads Discuss Hadoop, BigTable and Distributed Databases
http://www.infoq.com/news/2008/04/hbase-interview
Hadoop/HBase vs RDBMS
http://www.docstoc.com/docs/2996433/Hadoop-and-HBase-vs-RDBMS

scott.leberknight@nearinfinity.com
www.nearinfinity.com/blogs/
twitter: sleberknight
