13. Relational Model
Search by content
Good for query
Demands on processor
Rigid, fixed structures
Bad for modeling
http://www.flickr.com/photos/35536700@N07/3292544674
20. Can you do it with a relational database?
Is your DB falling apart?
What do you need?
21. Where RDBMS Fall Apart
Scaling
SPoF
Sharding
Denormalizing
Availability
Slave Systems
http://www.flickr.com/photos/horiavarlan/4681206711/
22. What do you need?
Reduced cost
Throughput
Availability
Recoverability
Correctness
Transactions
Flexible Schema
http://www.flickr.com/photos/ell-r-brown/5866777592/
23. What is NoSQL?
Flight
http://www.flickr.com/photos/24277960@N08/2609390563/
http://www.flickr.com/photos/taylar/4996955547/
http://www.flickr.com/photos/gromgull/611019520/
http://www.flickr.com/photos/igboo/2583174998/
24. What isn’t NoSQL?
NoFlight
http://www.flickr.com/photos/alanvernon/3121751152/
http://www.flickr.com/photos/tomsaint/3209482579/
http://www.flickr.com/photos/pointnshoot/408384715/
http://www.flickr.com/photos/zigazou76/5846255426/
26. Considerations
Data Model
Query/Search model
Transactional Semantics
Read vs Write Throughput
Deployment/Management
27. Focus on a few systems
MongoDB
Master-Slave
Redis
Fully Distributed
Riak
HBase
Cassandra
http://www.flickr.com/photos/seier/2455551478/sizes/l/in/photostream/
32. MongoDB
Master-slave replication
Asynchronous
Gives failover & data redundancy
But not consistency
Only master can receive writes
Makes atomic writes easy
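The replication bullets above can be illustrated with a toy model. This is a minimal sketch in plain Python, not MongoDB's actual protocol or driver API: the `Master`, `Slave`, and `replicate` names are hypothetical, and the queue of pending writes stands in for whatever log a real system ships to its slaves. It shows why asynchronous replication gives redundancy but not consistency: a read from the slave can miss a write the master has already accepted.

```python
from collections import deque


class Master:
    """Hypothetical master: the only node that accepts writes."""

    def __init__(self):
        self.data = {}
        self.pending = deque()  # writes accepted but not yet shipped to the slave

    def write(self, key, value):
        # A single writer applies each write in one place, in order.
        self.data[key] = value
        self.pending.append((key, value))


class Slave:
    """Hypothetical slave: serves reads, rejects writes."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)

    def write(self, key, value):
        raise PermissionError("only the master accepts writes")


def replicate(master, slave):
    """Ship pending writes to the slave; runs some time after the write."""
    while master.pending:
        key, value = master.pending.popleft()
        slave.data[key] = value


master, slave = Master(), Slave()
master.write("user:1", "alice")
stale = slave.read("user:1")   # replication has not run yet
replicate(master, slave)
fresh = slave.read("user:1")   # the slave has now caught up
```

Before `replicate` runs, the slave has no value for the key; afterwards it matches the master. If the master failed in between, the slave would take over without that write.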
33. Redis
Real-time stats tracking
Wicked fast
Collections built in
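As a rough illustration of real-time stats tracking with built-in collections, here is a sketch that uses a small in-memory stand-in rather than a real Redis client. The `TinyStore` class and the key names are hypothetical; its `incr`, `sadd`, and `scard` methods are named after the Redis commands INCR, SADD, and SCARD and only imitate their basic behavior.

```python
from collections import defaultdict


class TinyStore:
    """Hypothetical in-memory stand-in for a store with counters and sets."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.sets = defaultdict(set)

    def incr(self, key):
        # Increment a counter and return the new value.
        self.counters[key] += 1
        return self.counters[key]

    def sadd(self, key, member):
        # Add a member to a set; return 1 if it was new, 0 otherwise.
        before = len(self.sets[key])
        self.sets[key].add(member)
        return len(self.sets[key]) - before

    def scard(self, key):
        # Number of members in the set.
        return len(self.sets[key])


def track_hit(store, page, visitor):
    """Record one page view: bump the hit counter, remember the visitor."""
    store.incr(f"hits:{page}")
    store.sadd(f"visitors:{page}", visitor)


store = TinyStore()
for visitor in ["ann", "bob", "ann"]:
    track_hit(store, "/home", visitor)

total_hits = store.counters["hits:/home"]
unique_visitors = store.scard("visitors:/home")
```

The counter and the set are updated in place on every hit, so totals and unique-visitor counts are available immediately, with no separate aggregation query.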
Traverse links. Follow pointers. No notion of keys. Just data.
Up and down
Up down left right
Edgar F. Codd. Dominant by the 90s.
Emphasize search, not navigation. Foreign keys are a bad model. Relationships not explicit. E-R diagrams not until mid to late 70s.
Rackspace example
Google File System 2003. BigTable 2004.
Answer: Should I? Temptation: a new startup makes a blog post saying “we like it.” Hype. This is Hawt! I should be using it. New shiny.
Fads aside… Mistakes not evident at first.
Answer: Should I? Two questions. Hype. New shiny.
Fixed table space. Not linear: 2x space != 2x money.
Relational impedance
Data model is complex. Queries are represented as JSON.
Does do sharding
Like a memcache for lists and sets. Live data. Fast changing.
Snapshots: leave a delta exposed to data loss. Master/Slave: asynchronous replication.
Faithful Dynamo clone. MapReduce != Hadoop integration. Hooks == BigTable Coprocessors.
Bitcask -> small data set (keys must fit in memory). InnoDB -> big data set. Memory -> duh. REST: easy for programmers. Balancing: always 64 pieces.
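The "always 64 pieces" balancing note can be sketched as a fixed ring of partitions. This is an illustration under stated assumptions, not Riak's actual claim algorithm: the SHA-1-modulo mapping, the round-robin assignment, and the node names are all hypothetical. The point is that keys map to a fixed partition, and only the partition-to-node assignment changes when the cluster grows.

```python
import hashlib

RING_SIZE = 64  # fixed number of partitions, per the note above


def partition_for(key):
    """Map a key to one of the 64 partitions (assumed: SHA-1 modulo ring size)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def assign_partitions(nodes):
    """Hand out partitions to nodes round-robin (hypothetical scheme)."""
    return {p: nodes[p % len(nodes)] for p in range(RING_SIZE)}


def node_for(key, owners):
    """Find the node that currently owns the key's partition."""
    return owners[partition_for(key)]


owners3 = assign_partitions(["node-a", "node-b", "node-c"])
owners4 = assign_partitions(["node-a", "node-b", "node-c", "node-d"])

# The key's partition never changes; only the partition's owner may.
p = partition_for("user:1")
owner_before = node_for("user:1", owners3)
owner_after = node_for("user:1", owners4)
```

With four nodes each one owns 16 of the 64 partitions; with three nodes the split is 22, 21, and 21.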
Sorted – poor scanning performance
Dynamo + BigTable
Balancing: RegionServers + HDFS
Will be more choices and better solutions: 205 Million Dollars of Funding For Big Data Startups (http://datascience101.wordpress.com/2012/02/28/funding-for-big-data-startups/). Accel Partners. IA Ventures.