Rnotify

  1. Rnotify: A Scalable Distributed Filesystem Notification Solution for Applications
     Ashwin Raghav
     www.rnotifications.com
     github.com/ashwinraghav/rnotify-c/
  2. Agenda
     • Motivation
     • Problem Statement / State of the Art
     • General Overview
     • Hypothesis
     • Approach
     • Evaluation
     • Conclusion
  3. Motivation
     • Applications need file system notifications.
     • Previously, applications polled file systems naively.
     • Now, all operating systems provide a FS notifications API.
  4. Problem
     • VFS is an abstraction to treat all filesystems uniformly.
     • All FS reads/writes happen via VFS, making it the ideal place to implement notifications.
     • This does not work with distributed file systems: writes made on remote hosts never pass through the local VFS, so local notifications never fire.
  5. Problems / State of the Art
     • Applications use inotify to subscribe to local filesystems (a minimal example follows), but fall back on ad-hoc (polling) implementations for distributed FS.
     • Polling creates an unfortunate tension between resource consumption and timeliness.
     • Any general solution must be location transparent, scalable, and tunable.
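     For reference, a minimal sketch of the local inotify subscription that such applications use today; the watched path is illustrative.

        #include <stdio.h>
        #include <sys/inotify.h>
        #include <unistd.h>

        int main(void) {
            /* Create an inotify instance and watch one directory. */
            int fd = inotify_init();
            if (fd < 0) { perror("inotify_init"); return 1; }

            /* "/tmp/watched" is an illustrative path. */
            int wd = inotify_add_watch(fd, "/tmp/watched", IN_MODIFY | IN_CREATE);
            if (wd < 0) { perror("inotify_add_watch"); return 1; }

            char buf[4096];
            ssize_t len = read(fd, buf, sizeof buf);   /* blocks until events arrive */
            for (char *p = buf; len > 0 && p < buf + len; ) {
                struct inotify_event *ev = (struct inotify_event *)p;
                printf("event mask=0x%x name=%s\n", ev->mask, ev->len ? ev->name : "");
                p += sizeof(struct inotify_event) + ev->len;
            }
            close(fd);
            return 0;
        }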
  6. Requirements
     • Compatibility with existing applications that use inotify
     • Horizontal scalability, decomposition of functionality, and tunable performance
     • Location transparency
     • High notification throughput per client
  7. Assumptions
     • Relaxed reliability guarantees
     • Modified notification semantics
     • Congestion control semantics
     • Failure notification semantics
  8. Related Work
     • FAM (File Alteration Monitor) does not scale.
     • Internet-scale systems like Thialfi and ZooKeeper are built for much larger client populations.
     • Bayeux, Scribe, Siena, Hermes, Swag, etc. assume overlay networks that establish multicast trees for message dissemination.
     • inotify was introduced in kernel 2.6.13 for local FS notifications.
  9. Overview
     • Multiplexing/proxying subscriptions
     • Serializing notifications
     • Demultiplexing notifications
  10. Hypothesis
     By clearly decomposing functionality into replicable components, Rnotify can be tuned to fit different notification workloads and consistently deliver notifications at low latency.
  11. Key Properties
     • Low-latency notifications (under 10 ms)
     • Compatible with applications that use inotify
     • Tunable to fit workloads
     • Greedy applications can use Rnotify by distributing their workloads across nodes.
  12. Approach
     • Registration
     • Notification
     • Replica configuration management
  13. Registration
     • Inform the proxy about the newly watched file.
     • Place registrations on preferred publishers.
  14. Client Library & API Usage
     • Client-driven registration
     • Registration is transactional from the application's point of view.
     • Client-driven migration of subscriptions
  15. Client Library & API Usage (example)
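     The code on this slide did not survive extraction. In its place, a hypothetical sketch of what an inotify-style Rnotify registration could look like; the names rnotify_init and rnotify_add_watch, the mask value, the registrar address, and the watched path are all assumptions, not rnotify-c's actual interface.

        #include <stdio.h>

        /* Hypothetical client API mirroring inotify's shape; these names,
         * signatures, and constants are illustrative, not rnotify-c's
         * actual interface. Bodies are stubs so the sketch compiles. */
        #define RN_MODIFY 0x2   /* assumed to match inotify's IN_MODIFY */

        int rnotify_init(const char *registrar) {
            /* Stub: a real client would connect to the registrar here. */
            printf("connecting to registrar %s\n", registrar);
            return 3;
        }

        int rnotify_add_watch(int fd, const char *path, unsigned mask) {
            /* Stub: the real registration goes through the proxy on the
             * file host and is transactional from the application's point
             * of view: it lands on a preferred publisher or fails as a unit. */
            printf("fd %d: watch %s (mask 0x%x) registered\n", fd, path, mask);
            return 1;
        }

        int main(void) {
            int fd = rnotify_init("registrar.example.com:9000"); /* illustrative */
            int wd = rnotify_add_watch(fd, "/glusterfs/shared/log.txt", RN_MODIFY);
            printf("application now reads events for wd %d, as with inotify\n", wd);
            return 0;
        }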
  16. Notification Pipeline
     • Congestion control
     • Opportunistic batching
     • Publisher selection
  17. Dispatchers
     • Serialize notification blocks
     • Congestion control
     • Dispatch to the publisher
  18. Congestion Control at Dispatcher
     Each dispatcher keeps a frequency list, counting notifications per subscription in the current time window (see the sketch below):

     Subscription Id | Notifications in time window
     1               | 1000
     2               | 3000

     Based on these frequencies, a NOTIFICATION_BURST is sent to the publisher.
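     A minimal sketch of the frequency-list idea, assuming a fixed time window, a per-subscription counter, and an illustrative burst threshold; none of these constants come from the slides.

        #include <stdio.h>
        #include <string.h>

        #define MAX_SUBS        16
        #define BURST_THRESHOLD 2000   /* illustrative threshold */

        /* Per-window notification counts, indexed by subscription id. */
        static unsigned freq[MAX_SUBS];

        /* Called for every notification the dispatcher serializes. Returns 1
         * exactly once per window, when a subscription crosses the threshold
         * and a NOTIFICATION_BURST should be sent to the publisher in place
         * of further individual events. */
        int record_notification(int sub_id) {
            return ++freq[sub_id] == BURST_THRESHOLD;
        }

        /* Called when the time window ends. */
        void reset_window(void) {
            memset(freq, 0, sizeof freq);
        }

        int main(void) {
            for (int i = 0; i < 3000; i++)          /* subscription 2's window */
                if (record_notification(2))
                    printf("subscription 2: NOTIFICATION_BURST sent to publisher\n");
            reset_window();
            return 0;
        }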
  19. Avoid Atomic Broadcasts
     [Diagram showing multiple independent frequency lists]
  20. Publishers
     • Identify the subscribers for a notification.
     • Dispatch to those subscribers.
  21. Representing State - Publisher
     The publisher supports three operations over its state: get all subscribers, get all notifications, and append a new notification.

     File Id | Subscriber addresses
     1       | 192.168.1.2:3000, 192.168.3.4:3001
     2       | 192.168.1.2:3000, 192.168.3.4:3001

     File Id | Notifications
     1       | N1, N2, N3
     2       | N4, N5

     Subscriber       | Undelivered notifications
     192.168.1.2:3000 | N1, N2, N3
     192.168.3.4:3001 | N4, N5, N6
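     A minimal sketch of the first two tables as C structures, assuming fixed-size arrays for brevity (the per-subscriber undelivered queue is omitted); a real publisher would presumably use hash tables.

        #include <stdio.h>

        #define MAX_SUBS_PER_FILE 8
        #define MAX_NOTES        16

        /* Per-file state held by a publisher. */
        struct file_state {
            int         file_id;
            const char *subscribers[MAX_SUBS_PER_FILE]; /* IP:port per subscriber */
            int         n_subs;
            const char *notifications[MAX_NOTES];       /* pending notifications */
            int         n_notes;
        };

        /* The "append new notification" operation. */
        void append_notification(struct file_state *fs, const char *note) {
            if (fs->n_notes < MAX_NOTES)
                fs->notifications[fs->n_notes++] = note;
        }

        /* Get all subscribers and all notifications for a file, pair them up. */
        void dispatch(const struct file_state *fs) {
            for (int s = 0; s < fs->n_subs; s++)
                for (int n = 0; n < fs->n_notes; n++)
                    printf("send %s to %s\n", fs->notifications[n], fs->subscribers[s]);
        }

        int main(void) {
            struct file_state fs = { .file_id = 1, .n_subs = 2,
                .subscribers = { "192.168.1.2:3000", "192.168.3.4:3001" } };
            append_notification(&fs, "N1");
            append_notification(&fs, "N2");
            dispatch(&fs);
            return 0;
        }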
  22. Publisher Selection
     How do the dispatchers and the registrar maintain a shared understanding of "preferred" publishers?
  23. Partition and Placement of Publishers
     Each publisher is placed on a hash ring at the SHA1 of its IP address:
     pos1 = SHA1(Publisher1_IP_ADDR)
     pos2 = SHA1(Publisher2_IP_ADDR)
     pos3 = SHA1(Publisher3_IP_ADDR)
     pos4 = SHA1(Publisher4_IP_ADDR)
  24. Partition and Placement of Subscriptions
     Each subscription is placed on the same ring at the SHA1 of its file path:
     file1 = SHA1(File_Path1)
     file2 = SHA1(File_Path2)
     file3 = SHA1(File_Path3)
     file4 = SHA1(File_Path4)
  25. Arrival of a Publisher
     new_publisher = SHA1(New_Pub_IP_Addr)
     Reissue_registrations_between(pos1, pos2)
     A lock-free way to make the configuration eventually consistent (a sketch of the ring follows).
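     A minimal sketch of this consistent-hashing scheme, under two assumptions not stated on the slides: FNV-1a stands in for SHA1 so the sketch needs no crypto library, and a file's registrations are owned by the first publisher clockwise from the file's ring position.

        #include <stdio.h>
        #include <stdint.h>

        /* FNV-1a, standing in for SHA1 to keep the sketch dependency-free. */
        static uint32_t ring_pos(const char *key) {
            uint32_t h = 2166136261u;
            for (; *key; key++) { h ^= (uint8_t)*key; h *= 16777619u; }
            return h;
        }

        /* Assumed placement rule: a file's registrations are owned by the
         * first publisher clockwise from the file's position on the ring. */
        static int owner(const char *file_path, const uint32_t pos[], int n) {
            uint32_t fp = ring_pos(file_path), best_dist = UINT32_MAX;
            int best = 0;
            for (int i = 0; i < n; i++) {
                uint32_t dist = pos[i] - fp;  /* unsigned wraparound = ring walk */
                if (dist < best_dist) { best_dist = dist; best = i; }
            }
            return best;
        }

        int main(void) {
            const char *pubs[] = { "10.0.0.1", "10.0.0.2", "10.0.0.3" };
            uint32_t pos[3];
            for (int i = 0; i < 3; i++) pos[i] = ring_pos(pubs[i]);

            /* Registrar and dispatchers can each compute this locally, so they
             * share a view of "preferred" publishers without coordination;
             * when a publisher joins, only its ring segment is reissued. */
            printf("/data/a.log -> %s\n", pubs[owner("/data/a.log", pos, 3)]);
            printf("/data/b.log -> %s\n", pubs[owner("/data/b.log", pos, 3)]);
            return 0;
        }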
  26. Dispatcher Replication
     • A dispatcher is given the registrar's location at startup.
     • It acquires the publisher list from the registrar transactionally.
     • Dispatchers inform the proxies independently.
  27. Evaluation Strategy
     • Mid-size GlusterFS deployment on EC2
     • PostMark benchmark to represent FS activity
     • Chef used to start up serviced clients
     • Latency measured end to end
     • 8xl machines with 32 cores each helped simulate several clients apiece
     • All machines were acquired within a placement group
  28. Evaluation - Scalability
     • Tune dispatchers based on FS throughput.
     • Tune publishers based on the number of clients.
  29. Scalability - Overactive File Systems
     PostMark threads writing to different directories
  30. Scalability - Overactive File Systems
     PostMark threads writing to the same directory
  31. Scalability - Overactive File Systems
     • PostMark threads writing to different files (representing applications like web/mail servers)
     • PostMark threads writing to the same files (representing HPC applications)
  32. Scalability - Servicing Many Clients
  33. Performance
     • Demonstrate consistency
     • Demonstrate footprint in comparison to naive polling
  34. Performance - Consistency
  35. Comparison to Naive Polling
     • Developed a poller as a Node.js REST API.
     • With just 100 clients and 5 files, it issues 50,000 stats per second.
     • It has an extremely heavy footprint on FS performance (see the sketch below).
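     The poller above was a Node.js REST API; the C sketch below shows the same naive approach, where each client stat()s every watched file on a fixed interval and compares modification times. That is where the 50,000 stats per second come from: 100 clients x 5 files x 100 polls/s. Paths and interval are illustrative.

        #include <stdio.h>
        #include <string.h>
        #include <sys/stat.h>
        #include <unistd.h>

        #define N_FILES          5
        #define POLL_INTERVAL_US 10000  /* 10 ms => 100 polls/s per file */

        int main(void) {
            /* Illustrative paths on the shared filesystem. */
            const char *paths[N_FILES] = { "/glusterfs/f0", "/glusterfs/f1",
                "/glusterfs/f2", "/glusterfs/f3", "/glusterfs/f4" };
            struct timespec last[N_FILES] = { 0 };

            for (;;) {
                for (int i = 0; i < N_FILES; i++) {
                    struct stat st;
                    if (stat(paths[i], &st) != 0)    /* every poll hits the FS */
                        continue;
                    if (memcmp(&st.st_mtim, &last[i], sizeof last[i]) != 0) {
                        printf("%s changed\n", paths[i]);
                        last[i] = st.st_mtim;
                    }
                }
                usleep(POLL_INTERVAL_US);
            }
        }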
  36. Greedy Applications
     • Increasing the number of notifications delivered per client causes a linear increase in latency.
     • Messages spend more time in queues.
  37. Inotify - Inefficient Applications
  38. Greedy Applications
     If you need to consume more notifications, distribute yourself across nodes.
     [Diagram contrasting a distributed greedy application with an inefficient single-node application]
  39. Summary - Why Is This Work Different?
     • FAM does not scale and is obsolete.
     • Pub/sub systems do not cater to many notifications per client.
     • Multicast trees are established for reliability, so performance suffers.
     • Pub/sub systems provide a richer set of semantics at lower performance.
  40. Future Work
     • Introduce a security model
     • Introduce message ordering
     • Provide message delivery reliability
  41. Conclusion
     • Rnotify is a solution for receiving notifications from POSIX-compliant distributed file systems.
     • It can be tuned to fit different notification workloads.
     • It is incrementally scalable, location transparent, and mimics inotify.
     • We have tested Rnotify at up to 2.5 million notifications per second.
     • Latency is under 10 ms for 88% of notifications.
  42. Questions
  43. Subscription Proxy
     • Resides on the file host and proxies subscriptions and notifications.
     • Provides idempotent API wrappers for subscription (see the sketch below).
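     A minimal sketch of an idempotent subscription wrapper, assuming the proxy keeps a path-to-watch table so that a client retrying a subscription gets the existing watch back instead of creating a duplicate; all names here are illustrative.

        #include <stdio.h>
        #include <string.h>

        #define MAX_WATCHES 64

        /* Proxy-side table of active watches, keyed by file path. */
        static struct { char path[256]; int wd; } watches[MAX_WATCHES];
        static int n_watches;

        /* Stand-in for forwarding the subscription to a publisher. */
        static int forward_subscription(const char *path) {
            static int next_wd = 1;
            printf("proxy: forwarding subscription for %s\n", path);
            return next_wd++;
        }

        /* Idempotent wrapper: subscribing twice to one path returns the same
         * watch descriptor, so a client retrying after a timeout is safe. */
        int proxy_subscribe(const char *path) {
            for (int i = 0; i < n_watches; i++)
                if (strcmp(watches[i].path, path) == 0)
                    return watches[i].wd;            /* already watched: no-op */
            int wd = forward_subscription(path);
            snprintf(watches[n_watches].path, sizeof watches[0].path, "%s", path);
            watches[n_watches++].wd = wd;
            return wd;
        }

        int main(void) {
            int a = proxy_subscribe("/glusterfs/shared/log.txt");
            int b = proxy_subscribe("/glusterfs/shared/log.txt");  /* retry */
            printf("wd=%d, retried wd=%d\n", a, b);
            return 0;
        }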
  44. Design Alternatives
     • File system modification
     • VFS modification
     • Modifying the inotify implementation
  45. Latency Tests - Zero
  46. Throughput Tests - Zero
