Some Open Problems in Publish/Subscribe Networking (keynote talk at DEBS 2003)


Keynote talk at the 2nd International Workshop on Distributed Event-Based Systems (DEBS 2003), 8 June 2003.


  1. 1. Some Open Problems in Publish/Subscribe Networking
       David S. Rosenblum, Chief Technology Officer, PreCache Inc.
  2. 2. Acknowledgments
       • Alexander L. Wolf, University of Colorado at Boulder
       • Antonio Carzaniga, University of Colorado at Boulder
       • PreCache Engineering Team
  3. 3. Background
  4. 4. Information-Centric Internet Applications
       • Software and Antivirus Updates
       • Consumer Alerts
       • Location-Based Services for Mobile Wireless
       • Multiplayer Online Games
       • Web Search Engines
       • e-Business (e.g., Supply Chain Mgmt)
       • Distributed Sensor Networks
       Publish/subscribe is a natural fit!
  5. 5. Publish/Subscribe Networking
       • Publish/subscribe is traditionally implemented by centralized servers
       • But server-based realizations do not scale to Internet-wide applications
       • So existing networks require “faking it”
           • Request/response interaction
           • Continual subscriber polling
           • Enormous server farms
           • Dumb caching
       • And so we must realize publish/subscribe via a distributed network of routers
  6. 6. SIENA Content-Based Routing: Subscription Forwarding [Diagram: a network of routers numbered 1 to 9 with client a; routing-table entries of the form s1:x show subscription s1 being forwarded through the network.]
  7. 7. SIENA Content-Based Routing: Subscription Merging [Diagram: the same router network with clients a and b; routing-table entries for s1 and s2 are shown, annotated “s1 covers s2”.]
  8. 8. SIENA Content-Based Routing: Notification Delivery [Diagram: the same router network with the s1 and s2 routing-table entries; notification n1 is delivered, annotated “n1 matches s1” and “n1 matches s2”.]
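
The covering and matching relations named on the three SIENA slides can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which a subscription is a single numeric constraint on one attribute; the `Subscription` class and the `matches` and `covers` functions are hypothetical names for illustration, not SIENA's actual filter language or API.

```python
import operator
from dataclasses import dataclass

# Assumption: a subscription is one constraint "attr OP value".
# Real content-based filters are richer (several attributes and types).
OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


@dataclass(frozen=True)
class Subscription:
    attr: str
    op: str       # one of ">", "<", "="
    value: float


def matches(notification: dict, sub: Subscription) -> bool:
    """True if the notification satisfies the subscription's constraint."""
    if sub.attr not in notification:
        return False
    return OPS[sub.op](notification[sub.attr], sub.value)


def covers(s1: Subscription, s2: Subscription) -> bool:
    """True if every notification that matches s2 also matches s1."""
    if s1.attr != s2.attr:
        return False
    if s1.op == ">":
        # "x > a" covers "x > b" when a <= b, and "x = b" when b > a.
        return (s2.op == ">" and s1.value <= s2.value) or \
               (s2.op == "=" and s2.value > s1.value)
    if s1.op == "<":
        return (s2.op == "<" and s1.value >= s2.value) or \
               (s2.op == "=" and s2.value < s1.value)
    # s1 is an equality constraint: it only covers the identical equality.
    return s2.op == "=" and s1.value == s2.value


s1 = Subscription("price", ">", 10)
s2 = Subscription("price", ">", 50)
n1 = {"price": 60}
print(covers(s1, s2), matches(n1, s1), matches(n1, s2))
```

When `covers(s1, s2)` holds, a router that has already forwarded s1 toward a neighbor does not need to forward s2 in the same direction, which is the saving that subscription merging aims for.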
  9. 9. PreCache NETINJECTOR Architecture [Diagram: publisher and subscriber connected across the Internet; legend: Event Agent, Routing Engine, Channel Manager.]
  10. 10. PreCache NETINJECTOR
       • Routing and forwarding based on SIENA
           • Generalize idea of subscription merging
           • Compute single subscription covering all received subscriptions
           • Employ approximate matching
           • Constant time and space complexity
           • Log time and space with additional leakage reduction
       • Channel services
           • Namespace management
           • Resource allocation
           • Load balancing, fault tolerance, authentication
  11. 11. Open Problems
  12. 12. Comments on the Problems
       • Problems identified based on experience with NETINJECTOR and SIENA
       • Many of the problems arise because of a desire for scalability
       • Some problems are deeply technical
       • Other problems are simply pragmatic
  13. 13. Problem: Wireless Mobile Devices (WMDs?) [Diagram: the SIENA router network (routers 1 to 9) with client a and routing-table entries for subscription s1.]
  14. 14. Problem: Issues with Wireless Mobile Devices
       • Caching notifications in the network
       • Stream reconstruction and duplicate suppression
       • Frequency of movement versus overhead of reconfiguration
       • Gateways for email, SMS, etc.
  15. 15. Problem: Security
       • Traditional security properties are address-based
           • Example: Authentication
           • Bob wants to make sure Alice sent the message
           • Content-based analog
           • Bob wants to make sure a message represents reality
       • Pub/sub admits new kinds of vulnerabilities
           • Example: Denial of Service
           • Highly generic subscription (“Price > 0”) causes flood of notifications to subscriber
           • How do you distinguish a malicious subscriber from a greedy subscriber?
       • How do you do content-based routing when the content is encrypted???
  16. 16. Problem: Client Connections and Firewalls
       • Want constant connection between subscriber and edge router
           • Otherwise subscriber polls for notifications
           • Connection limits may require multiplexing
       • Client must initiate connection to edge router in order to breach firewall
       • And if port 80 is the only open port …
           • Need HTTP encapsulation of messages
           • May need HTML formatting of messages
           • Routers need to multiplex and/or demultiplex message traffic
  17. 17. Problem: Approximate Matching
       • Rationale: High-performance routing
           • Expect approximate matching to have better time/space complexity than exact matching
       • Approximation must be conservative
           • False positives OK, false negatives not
           • Must still perform exact match at some point before delivery to subscriber
       • Leakage may increase traffic
           • Tradeoff in computational resources
       • We need simulation tools to explore this!
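
The requirement that approximation be conservative can be made concrete with a small sketch. It assumes, purely for illustration, that each subscription is a closed numeric interval and that a router summarizes all of them by one bounding interval; this is a stand-in technique, not PreCache's actual matching algorithm, and all names here are hypothetical.

```python
from typing import List, Tuple

# Assumption: a subscription is a closed interval (lo, hi) on one value.
Interval = Tuple[float, float]


def summarize(subs: List[Interval]) -> Interval:
    """Constant-space summary: the smallest interval containing every subscription."""
    return (min(lo for lo, _ in subs), max(hi for _, hi in subs))


def approx_match(value: float, summary: Interval) -> bool:
    """Cheap in-network test; may admit false positives, never false negatives."""
    return summary[0] <= value <= summary[1]


def exact_match(value: float, subs: List[Interval]) -> bool:
    """Full test, performed before delivery to the subscriber."""
    return any(lo <= value <= hi for lo, hi in subs)


def leakage(values: List[float], subs: List[Interval]) -> float:
    """Fraction of forwarded notifications that no subscriber actually wanted."""
    summary = summarize(subs)
    forwarded = [v for v in values if approx_match(v, summary)]
    if not forwarded:
        return 0.0
    leaked = [v for v in forwarded if not exact_match(v, subs)]
    return len(leaked) / len(forwarded)


subs = [(0.0, 10.0), (90.0, 100.0)]
values = [5.0, 50.0, 95.0, 150.0]
print(summarize(subs), leakage(values, subs))
```

In this toy case the value 50.0 passes the approximate test but fails the exact one, so it is forwarded and then dropped at the edge. That wasted traffic is the leakage the slide refers to, and a function like `leakage` is the kind of measurement a simulation tool would report.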
  18. 18. Problem: Optimizing for Traffic Variations
       • Can routers dynamically optimize for traffic variations?
       • Example: The Britney Spears Effect
           • All subscribers want certain notifications N1
           • Few subscribers want other notifications N2
           • N1 notifications may flood network
       • Example: The Google Effect
           • Certain subscribers S1 want all notifications
           • Other subscribers S2 want few notifications
           • S1 subscriptions may dominate routing
       • We need simulation tools to explore this!
  19. 19. Problem: Service-Provider Deployment
       • Difficult to convince network service providers to enhance their networks with publish/subscribe
           • Application demand not yet critical
           • Lack of standards
       • Economic barriers govern router design
           • Example: 100M users, $10K/router
           • 1000 users/router: 100K routers, $1B outlay
           • 100 routers: $1M outlay, 1M users/router
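
The arithmetic behind the economic example can be checked with a short sketch. The function name and parameters are hypothetical; the inputs are the slide's own figures (100M users, $10K per router).

```python
def deployment(users: int, cost_per_router: int, users_per_router: int):
    """Return (number of routers, total outlay in dollars) for a deployment."""
    routers = -(-users // users_per_router)  # ceiling division
    return routers, routers * cost_per_router


USERS = 100_000_000
ROUTER_COST = 10_000

# Many small routers: 1000 users each.
print(deployment(USERS, ROUTER_COST, 1_000))
# Few large routers: 1M users each.
print(deployment(USERS, ROUTER_COST, 1_000_000))
```

The two calls reproduce the slide's two extremes: a thousandfold reduction in outlay is bought with a thousandfold increase in the load each router must carry.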
  20. 20. Problem: Peer-to-Peer Deployment
       • Reasonable alternative to service-provider deployment
           • “Grass roots” generation of demand
       • Challenges
           • Dynamically aligning peer topology to underlying network topology
           • Dynamically partitioning routing responsibilities across peers
           • Ensuring reliability, privacy and/or integrity of messages
  21. 21. Problem: Unicast Fanout at Edge Routers
       • Example: 100M users on 1K routers
           • 100K users per router
           • 10Kbyte notification
           • >80 milliseconds over OC-192
           • >80 seconds over 10Mb Ethernet
           • >4 hours over 56K modem
       • Idea: Use publish/subscribe for “leveling”
           • Partition users into classes
           • Example: Last digit of serial number
           • Publish once per class
           • Tune publication rate to available bandwidth and SLA
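
A rough sketch of the fanout cost and of the “leveling” idea follows. It assumes the edge router sends one unicast copy per subscriber and counts only raw transmission time, ignoring protocol overhead; the slide gives its figures as lower bounds, and the sketch simply computes the time for whatever sizes and link rates are passed in. All function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def fanout_seconds(subscribers: int, notification_bytes: int,
                   link_bits_per_second: float) -> float:
    """Raw time to push one unicast copy of a notification to every subscriber."""
    return subscribers * notification_bytes * 8 / link_bits_per_second


def subscriber_class(serial_number: int, classes: int = 10) -> int:
    """Assign a user to a class by the last decimal digit(s) of a serial number."""
    return serial_number % classes


def partition(serial_numbers: Iterable[int], classes: int = 10) -> Dict[int, List[int]]:
    """Group users by class so a notification can be published once per class."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for sn in serial_numbers:
        groups[subscriber_class(sn, classes)].append(sn)
    return dict(groups)


# Fanout time grows linearly with both subscriber count and notification size.
print(fanout_seconds(100_000, 10_000, 10_000_000))
# With ten classes, each per-class publication reaches about a tenth of the users.
print({k: len(v) for k, v in partition(range(1000)).items()})
```

Publishing once per class spreads the same total traffic over time, so the rate of per-class publications can be tuned to the available bandwidth, as the slide suggests.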
  22. 22. Conclusion
  23. 23. Conclusion
       • Many Internet applications naturally require publish/subscribe messaging
       • Scalability can be achieved through publish/subscribe networking
       • SIENA, PreCache, and others have established many fundamental results
       • But many open problems remain to be solved