Keynote for CSE conference 2011: Distributed Systems: What?  Why? And bit of How?
Did this as the keynote for CSE conference 2011, University of Moratuwa

  1. I cannot cover Distributed Systems in 30 minutes! But I can tell you why you might want to learn Distributed Systems in 30 minutes!
     Photos: http://www.flickr.com/photos/uwehermann/82753155/sizes/m/in/photostream/ and http://www.flickr.com/photos/peterpearson/5921765552, licensed under CC
  2. What is a Distributed System?
     "A distributed system is one on which I cannot get any work done because some machine I have never heard of has crashed." - Leslie Lamport
  3. What is a Distributed System?
     "A system in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing." - [Coulouris]
     "A distributed system is a collection of independent computers that appear to the users of the system as a single coherent system." - [Tanenbaum]
  4. Characteristics and Challenges
     • No Global Clock
     • Communication only by message passing
     • No Global State
     • Independent Failures
     • Fault Tolerance
     • Scale
     • Transparency
     Photo by John Trainor on Flickr, http://www.flickr.com/photos/trainor/2902023575/
  5. Fallacies of Distributed Systems
     • The network is reliable.
     • Latency is zero.
     • Bandwidth is infinite.
     • The network is secure.
     • Topology doesn't change.
     • There is one administrator.
     • Transport cost is zero.
     • The network is homogeneous.
     Photo: http://www.flickr.com/photos/12587661@N06/2300406685, by Michael Gwyther-Jones
  6. Why Distributed Systems
     • Need to build bigger systems
     • Many use cases are inherently distributed
     • To avoid failures
     • Omnipresence
       – If you buy food from a supermarket
       – If you buy a book from a bookshop chain
       – If you search the Web
       – If you use a GPS navigator
       – If you turn on your My 10 list
       – If you pay a bill
       – If you use your mobile app
  7. A System Use Case Classification
     • Processing Data (Moving vs. Stored Data)
     • Servers: Receive, Process, and Respond
     • Running User Provided Jobs
     • Data Storages and Provenance
     Photo: http://www.flickr.com/photos/kelsea-groves/5535666329/
  8. Use case: Processing Data: React to Sensors
     • Many sensors: Weather, Travel, Traffic, Surveillance, Stock exchange, Smart Grid, Production line
     • Monitor, understand, and react to events
     • Usually handled with CEP (e.g. Esper, StreamBase, Siddhi) or Stream Processing (S4, Twitter Stream)
     Photos: http://www.flickr.com/photos/imuttoo/4257813689/ by IanMuttoo, http://www.flickr.com/photos/eastcapital/4554220770/, http://www.flickr.com/photos/patdavid/4619331472/ by Pat David, licensed under CC
  9. Use case: Processing Data: Target Marketing
     • Receive data about users continuously: e.g. web clicks, what they bought, what they like and do not like, what their friends like and bought
     • Build models and index information in the background
     • Send each user the advertisements that best match their preferences
       – This has to be done quickly, within a few (say 50) milliseconds
     • Could be the next billion dollar problem
  10. Use case: Receive, Process, and Respond: Online Store (e.g. Amazon)
     • Many sellers selling many items, and many buyers
     • List of all items, with their specs
     • Index items by many dimensions and support search
     • Support checkout, delivery tracking, returns, ratings, and complaints
     • Supported by partitioning sellers/items across many nodes
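The last bullet on this slide, partitioning sellers and items across many nodes, can be illustrated with a minimal sketch. It assumes the simplest possible scheme (hash the key, take it modulo the node count); the node names and item keys are hypothetical, and the slide does not say which scheme any real store uses.

```python
import hashlib


def node_for(key: str, nodes: list[str]) -> str:
    """Pick the node that owns `key`; the same key always maps to the same node."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def partition(keys: list[str], nodes: list[str]) -> dict[str, list[str]]:
    """Group keys by owning node."""
    placement: dict[str, list[str]] = {node: [] for node in nodes}
    for key in keys:
        placement[node_for(key, nodes)].append(key)
    return placement


# Hypothetical node names and item keys, for illustration only.
NODES = ["node-a", "node-b", "node-c"]
ITEMS = [f"item-{i}" for i in range(10)]

if __name__ == "__main__":
    for node, items in partition(ITEMS, NODES).items():
        print(node, items)
```

A stable hash such as SHA-256 is used instead of Python's built-in `hash()`, which is randomized per process for strings and so would place the same key on different nodes across restarts.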
  11. Use case: Running User Provided Jobs: SETI@Home
     • Many people volunteer their computing power
     • Scientists submit computing jobs to the system
     • Broker and match resources with jobs, run them, and return results. Handle failures. Avoid free riding.
     • Considered the biggest computer on Earth (505 TFLOPS, 150k active computers)
     Image: http://www.elfwood.com/~axthony/Staring-Aliens.2552052.html, licensed under CC
  12. Use case: Data Storages and Provenance (Sky Server)
     • Telescopes (Square Kilometer Array) keep collecting data from the sky (terabytes per day)
     • Sky Server lets scientists view the sky at a given location, as seen at a given time
     • Moving data takes a long time. 1 TB takes roughly:
       – 100 Mbps network: 30 hrs
       – 1 Gbps network: 3 hrs
       – 10 Gbps network: 20 minutes
     • Given a data item, need to track how it was created, equipment accuracy, transformations used, etc.
     Photos: http://www.fotopedia.com/items/flickr-518876976 and http://www.geograph.org.uk/photo/103069, licensed under CC
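The transfer times on this slide can be checked with a little arithmetic. The sketch below assumes 1 TB = 10^12 bytes. At full line rate the results are about 22 hours, 2.2 hours, and 13 minutes; the slide's rounder figures (30 hrs, 3 hrs, 20 minutes) match an effective throughput of about 75% of the link speed. That `efficiency` value is an assumption made here for illustration and is not stated in the talk.

```python
def transfer_hours(size_bytes: float, link_bps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move `size_bytes` over a link of `link_bps` bits per second.

    `efficiency` is the fraction of the nominal link rate actually achieved
    (an illustrative parameter, not something taken from the slides).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = size_bytes * 8 / (link_bps * efficiency)
    return seconds / 3600


TB = 10**12  # assuming decimal terabytes

if __name__ == "__main__":
    for label, bps in [("100 Mbps", 100e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)]:
        ideal = transfer_hours(TB, bps)
        realistic = transfer_hours(TB, bps, efficiency=0.75)
        print(f"{label}: {ideal:.2f} h at line rate, {realistic:.2f} h at 75% efficiency")
```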
  13. Mobile Sensor Crowdsourcing
     • Mobile phones are now like a weather center; a phone has
       – a barometer
       – a temperature sensor
       – a proximity sensor
       – GPS
       – a moisture sensor
     • Get volunteer phones to send sensor data (crowdsourcing)
       – reports on weather
       – crop diseases (agriculture officials)
       – epidemics (from hospitals, doctors)
     • Use that to predict weather, crop disease, and epidemic spread
     • Moving Sensors (Polar Grid)
     Photos: http://www.fotopedia.com/items/flickr-2548697541, http://www.geograph.org.uk/photo/1534209, and http://www.yourbdnews.com/2011/10/17/samsung-files-to-halt-iphone-4s-in-japan-australia/iphone-4s, licensed under CC
  14. Great! Let's see which distributed systems technologies have made these use cases possible!
  15. Distributed Systems Timeline/History
     Period          Topics
     1965-late 70s   Parallel Programming, Self Stabilization, Fault Tolerance, ER Model/Transactions, Time Clock
     1980s           Consensus and impossibility, SQL, Distributed Snapshots, Replications, Group Communication
     Early 90s       Linearizability, Parallel DB, Transactional Memory, RAID, MPI
     Late 90s        Volunteer Computing, P2P file sharing, Complex Event Processing
     Early 2000s     OceanStore, Web Services, Semantic Web, REST, DHT, Pub/Sub, Grid, Autonomic Computing, Google File System, Virtualization, SOA, MapReduce
     2005-2010       Cloud, NoSQL, Mobile Apps, Data Provenance
  16. Theoretical Computer Science
     • Concerned with
       – Coordination algorithms: leader election, multicast, distributed locks, barriers, snapshot algorithms
       – Impossibility results, upper and lower bounds
       – Distributed versions of some centralized algorithms (e.g. shortest path)
       – A lot of this work was done in the 70s, and it laid the groundwork for Distributed Systems
     Images: http://www.flickr.com/photos/lodz_na_nowo/5690492370/, http://xkcd.com/384/, http://www.flickr.com/photos/quinnanya/4990131194/sizes/z/in/photostream/, licensed under CC
  17. Communication Protocols
     • Request/Response
       – RMI, CORBA, REST/HTTP, WS, Thrift
     • Publish/Subscribe
     • Distributed Queues
     • DHT (Distributed Hash Tables)
     • Gossip/Epidemic Protocols
     • Whiteboards
     Photo: http://www.flickr.com/photos/novecentino/2596898279/, licensed under CC
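Of the protocols listed on this slide, gossip is perhaps the least self-explanatory, so here is a small simulation as a rough sketch. It assumes one simple variant, push gossip, where in every round each informed node tells one peer chosen uniformly at random. The function name and parameters are made up for this example; real systems add failure handling, anti-entropy, and membership management.

```python
import random


def gossip_rounds(n_nodes: int, seed: int = 0) -> int:
    """Simulate push gossip and return the rounds needed to inform every node.

    Node 0 starts with the rumor. In each round, every informed node picks one
    peer uniformly at random (possibly itself or an already informed node) and
    passes the rumor on.
    """
    if n_nodes < 1:
        raise ValueError("need at least one node")
    rng = random.Random(seed)  # seeded so that a run is reproducible
    informed = {0}
    rounds = 0
    while len(informed) < n_nodes:
        newly_informed = {rng.randrange(n_nodes) for _ in informed}
        informed |= newly_informed
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, "nodes informed after", gossip_rounds(n), "rounds")
```

Because the informed set can at most double per round, at least log2(n) rounds are always needed; in practice the count grows roughly logarithmically with the number of nodes, which is why such protocols are attractive at scale.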
  18. Request/Response and Architectural Styles
     • Message formats
       – RMI, CORBA, REST/HTTP, Web Service, Thrift
     • Architectural Styles
       – Remote Procedure Calls (RPC)
       – Distributed Objects
       – Service Oriented Architecture (SOA)
       – Resource Oriented Architecture (ROA)
  19. Known Distributed Architecture Patterns
     • LB + Shared Nothing Nodes
     • LB + Stateless Nodes + Scalable Storage
     • DHT (Distributed Hash Table)
     • Distributed queues
     • Publish/Subscribe Broker Network
     • Gossip architectures + biology inspired algorithms
     • MapReduce/data flows
     • Stream processing
     • Tree of responsibility
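To make the DHT pattern in this list more concrete, the following is a minimal sketch of consistent hashing, the key placement idea behind DHTs such as Chord. It is a single-process illustration only: there is no networking or routing, and the `HashRing` class, the node names, and the `replicas` parameter (virtual points per node) are assumptions made for this example.

```python
import bisect
import hashlib


def _ring_position(value: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent hashing ring: a key belongs to the first node point clockwise."""

    def __init__(self, nodes=(), replicas: int = 64):
        self._replicas = replicas   # virtual points per node, smooths the load
        self._points: list[int] = []  # sorted ring positions
        self._owner: dict[int, str] = {}  # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self._replicas):
            point = _ring_position(f"{node}#{i}")
            if point in self._owner:  # extremely unlikely collision, skip it
                continue
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node: str) -> None:
        for i in range(self._replicas):
            point = _ring_position(f"{node}#{i}")
            if self._owner.get(point) == node:
                del self._owner[point]
                self._points.remove(point)

    def lookup(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring is empty")
        # First point strictly after the key's position, wrapping around.
        index = bisect.bisect_right(self._points, _ring_position(key)) % len(self._points)
        return self._owner[self._points[index]]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])  # hypothetical node names
    print(ring.lookup("item-1"), ring.lookup("item-2"))
```

The useful property is that removing a node only moves the keys that node owned; keys on the surviving nodes stay where they were, unlike hash-mod-N placement, where most keys move when N changes.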
  20. LB + Shared Nothing and 3-Tier
     • Most common scaling pattern
     • Most architectures follow this model
  21. Storages
     • Single Database
     • Replicated Databases
     • Parallel Databases (Sharding)
     • NewSQL (in-memory, sharding, highly optimized)
     • NoSQL (Column Family, Key-Value pair, Document)
  22. Building Scalable Systems
     • Single Machine
     • Shared Memory Model
     • Clustering (state replication through group communication)
     • Shared Nothing
     • Loose Consistency with Shared Nothing
     Photo: http://www.fotopedia.com/items/louromig-8P4w6xtSgbY, licensed under CC
  23. Publish/Subscribe and EDA
     • Many publishers send events
     • Subscribers register their interest in events, and a publish/subscribe network matches events against those subscriptions and routes them to the subscribers
     • Scalable implementations exist
     • Basis for event driven architectures
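The matching step described on this slide can be sketched with a tiny in-process broker. It assumes topic-based subscriptions, where a subscriber registers a callback for an exact topic name. The `Broker` class and the topic names are hypothetical; a real publish/subscribe network would distribute this matching across many broker nodes.

```python
from collections import defaultdict
from typing import Any, Callable

Handler = Callable[[Any], None]


class Broker:
    """Minimal topic-based publish/subscribe broker (single process, synchronous)."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register `handler` to be called for every event published on `topic`."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Any) -> int:
        """Deliver `event` to all subscribers of `topic`; return how many received it."""
        handlers = list(self._subscribers.get(topic, ()))
        for handler in handlers:
            handler(event)
        return len(handlers)


if __name__ == "__main__":
    broker = Broker()
    broker.subscribe("weather", lambda e: print("weather event:", e))
    broker.publish("weather", {"city": "Colombo", "temp_c": 31})
    broker.publish("traffic", {"road": "A1"})  # no subscribers, silently dropped
```

Publishers and subscribers never reference each other directly, only the topic; that decoupling is what makes this a basis for event driven architectures.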
  24. Cloud Computing
     • The ability to buy computation power, storage, or execution services as a utility, on demand
     • The best way to explain it is by comparing it to electricity
     • The idea is to share a big pool of servers
       – Economies of scale through optimized large-scale operations
       – Resource pooling
       – No need for capacity planning: start small and grow as needed
       – Outsourcing, which enables specialization
     Photo by LoopZilla on Flickr, http://www.flickr.com/photos/loopzilla/2328231843/sizes/m/in/photostre
  25. Where do we go from here?
  26. If You Plan to Learn about Distributed Systems
     • One of the fields to learn by doing
     • You have to be a good programmer
       – a patient one (debugging)
       – a lazy one (but intelligent)
     • Start by writing some Web Services, request/response stuff
     • Stop reinventing the wheel, start using tools (middleware)
     • Learn ZooKeeper
     • Take a class: read, write code, debug, ...
     Photo: http://www.flickr.com/photos/mariachily/5250487136, licensed under CC
  27. Distributed System Community
     • Based around ACM, IEEE, and USENIX
     • Well known journals
       – IBM Systems Journal, ACM Operating Systems Review, ACM Transactions on Computer Systems, IEEE Distributed Systems Online, IEEE Transactions on Parallel and Distributed Systems
     • Conferences
       – Theory: ICDCS, SPDC
       – SOA/Cloud: ICWS
       – E-Science, Parallel Programming: HPDC, SC, E-Science, CCGrid
       – Systems: USENIX, Middleware, ACM Symposium on Operating Systems Principles, FAST, LISA, OSDI
       – DB: SIGMOD Record, VLDB
     • Awards
       – Turing Award
       – Edsger W. Dijkstra Prize in Distributed Computing
     Photos: http://www.flickr.com/photos/dullhunk/4187914071, http://www.fotopedia.com/items/flickr-1544709148, licensed under CC
  28. A Few Must Read Papers
     • System Structure for Software Fault Tolerance (1975)
     • Time, Clocks, and the Ordering of Events in a Distributed System (1978)
     • Reaching Agreement in the Presence of Faults (1980)
     • The Byzantine Generals Problem (1982)
     • End-to-End Arguments in System Design (1984)
     • A Note on Distributed Computing (1994)
     • Scale in Distributed Systems (1994)
     • Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications (2001)
     • The Google File System (2003)
     • Xen and the Art of Virtualization (2003)
     • MapReduce: Simplified Data Processing on Large Clusters (2004)
  29. Some Open Challenges
     • Everything Data: Analytics, AI, Data Mining (distributed versions of many algorithms)
     • Complex Event Processing (CEP)
     • How to Scale?
     • Middleware for the Cloud
     • Scalable Storage
     • Provenance
     • Workflows
     • Guard against DDoS and other distributed security issues
     Photo: http://www.flickr.com/photos/brianscott/5474210001, licensed under CC
  30. Questions?
     Photo copyright by romainguy, licensed for reuse under CC License, http://www.flickr.com/photos/romainguy/249370084