Some scalable systems
Google ~ BigTable
Amazon ~ Dynamo ~ SimpleDB
Microsoft ~ Powerset ~ Bing ~ Dynomite
Twitter ~ Hadoop ~ Pig
Facebook ~ Digg ~ Cassandra ~ Thrift
Nasdaq ~ tin ~ text & filesystem
Akamai ~ Riak
Ubuntu ~ LHC ~ BBC ~ CouchDB
LinkedIn ~ Gilt ~ Voldemort
Business Insider ~ MongoDB
Stuff built in Erlang by guys with physics degrees
How they define scalable
If I add X resources, then I gain X performance.
If I double my nodes (servers), then I should get double the computing power.
If I double my processors, then the processing should take half as long.
If I double my network bandwidth, then I should be able to transmit twice as fast or twice as much data.
If we double the number of developers, then we should get twice the amount of work done.
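As a rough illustration of the "add X resources, gain X performance" definition above, here is a minimal Python sketch that compares the observed speedup with the resource increase. The function name and the sample numbers are hypothetical, not taken from the talks; a result of 1.0 would mean the system scaled linearly.

```python
def scaling_efficiency(resources_before, throughput_before,
                       resources_after, throughput_after):
    """Return the observed speedup divided by the resource increase.

    1.0 means linear scaling (double the resources, double the throughput);
    a value below 1.0 means part of the added capacity was lost to overhead.
    """
    resource_ratio = resources_after / resources_before
    speedup = throughput_after / throughput_before
    return speedup / resource_ratio


if __name__ == "__main__":
    # Hypothetical measurement: 4 nodes serve 1000 req/s, 8 nodes serve 1700 req/s.
    print(scaling_efficiency(4, 1000, 8, 1700))
```

Anything noticeably below 1.0 suggests a shared bottleneck somewhere, which is what the shared-nothing designs discussed at the conference try to avoid.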
Some chatter dump
No… SQL, ORMs, Schemas, Joins, Foreign Keys, Transactions, ACID, RDBMS
Distributed Key/Value Stores ~ Document-oriented Database ~ MapReduce
Functional Languages ~ Erlang ~ F# ~ No OO
RESTful ~ JSON ~ BSON ~ HTTP
Horizontal vs. Vertical Scaling
Google Bigtable Paper
Dynamo Amazon Paper
CAP Theorem (Consistency, Availability, Partition Tolerance) ~ Only 2 @ a time
BASE ~ Eventually Consistent for High Availability ~ DNS
SLA ~ Number of 9s
Code for Failure ~ Fault-tolerance ~ Graceful Degradation
SN (Shared Nothing) Architecture ~ No bottlenecks
Sharding ~ Horizontal Partitioning
Distributed Map ~ Consistent Hashing (Ring of Nodes)
Sloppy Quorum ~ Minimum Nodes for R/W
Hinted Handoff ~ Always Writeable ~ Handles Temp Failures
Merkle Tree Replication ~ Handles Permanent Failures
Fault-tolerance ~ Read-Repair ~ Replication
Vector Clocks (node, counter) ~ No Wall Clocks
SuperColumns ~ ColumnFamily
Stateless App Servers ~ P2P Bootstrapping
CDN (Content Delivery Network)
MVCC (Multiversion Concurrency Control) ~ B-tree ~ Tail Appends ~ Cluster Rebalancing
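To make the "Distributed Map ~ Consistent Hashing (Ring of Nodes)" item a bit more concrete, here is a minimal Python sketch of a hash ring. It is an illustration under my own assumptions, not the implementation used by Dynamo, Cassandra, Riak, or any other system named in these slides: the `HashRing` class, the MD5-based placement, and the `vnodes` count of 64 are all hypothetical choices.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (a 128-bit integer from MD5)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """A ring of nodes; each key belongs to the first node point clockwise from it."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes   # virtual points per physical node (assumed value)
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        """Place the node's virtual points on the ring."""
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node: str) -> None:
        """Drop every point owned by the node; its keys fall to the next points."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        """Return the node responsible for the key."""
        if not self._points:
            raise LookupError("ring has no nodes")
        # First point strictly clockwise of the key, wrapping around at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    keys = [f"user:{i}" for i in range(1000)]
    before = {k: ring.lookup(k) for k in keys}
    ring.add("node-d")
    moved = [k for k in keys if ring.lookup(k) != before[k]]
    print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

The point of the ring is that adding or removing a node only remaps the keys next to that node's points, instead of reshuffling nearly everything the way `hash(key) % node_count` would.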
Some popular reads
(Brewer’s CAP theorem) Towards Robust Distributed Systems
http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
(Google) Bigtable: A Distributed Storage System for Structured Data
http://labs.google.com/papers/bigtable-osdi06.pdf
Dynamo: Amazon’s Highly Available Key-value Store
http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf
I threw a few slides together based on my notes from NoSQLEast, a conference I attended last week. There were some very interesting presentations on how people develop and maintain massive systems.