The bigrabbit


Published in: Technology, Business

  • Built as a story. Please comment and ask questions during the presentation. Integrasco: a social media analytics firm that specializes in mobile telecom. We store and index data in a social media search system. 22.6796185 kg
  • Humankind has a tendency to jump on the most popular bandwagon: restaurants, queues, stocks. Sheep mentality.
  • If you recall this autumn: Steve Jobs: October 5. iPhone 4S: October 4, 2011 / October 14, 2011. Gaddafi: October 20, 2011.
  • It never rains, but it pours. BlackBerry outages: Monday 10 October to Monday 17 October. RIM acknowledges ongoing email and messaging problems for customers in Europe, the Middle East and Africa. 5-21… 30 million tweets; normal is 12 million.
  • Legacy Java system, built and rebuilt continuously over many years: MySQL, Hibernate, SOAP (CXF).
  • Supported by a cutting-edge HW architecture. HW from HP.
  • Twitter is US-heavy. Most active: 23:00 CET. Volume can vary from 100 to 25 000 m/m, related to fluctuations in traffic. 30 – 100 million.
  • Simple and good timer utils. Have a log. Warn the management.
  • Dependency between components was too high. Unpredictable memory/CPU consumption. Clients struggled with SOAP. Difficult to scale up Jetty, Tomcat, etc. Difficult to restart the components.
  • To me it looks like a message queue solution.
  • Do not repeat yourself -> do not repeat what others do better. Kafka: LinkedIn. Kestrel: Twitter, Facebook (Scala). Digg: RabbitMQ. Pinterest: RabbitMQ.
  • Erlang. ~15 000 – 20 000 lines of code.
  • WebSocket is under development. JavaScript for running on node.js. Not mainframes.
  • - Why we selected the
  • AMQP 0-9-1 is a programmable protocol in the sense that AMQP entities and routing schemes are defined by applications themselves, not a broker administrator. ‘A low-level interface. It typically refers to programming interfaces (APIs) in a network directly above the physical layer that are used strictly for transport or interconnection. It often refers to protocols that invoke functions such as CORBA, DCOM, RMI and SOAP. It may also refer to database and other such interfaces.’
  • Exchanges and their pre-declared names: direct exchange: (empty string) and DEFAULT; fanout exchange: amq.fanout; topic exchange: amq.topic; headers exchange: amq.match (and amq.headers in RabbitMQ). Exchange attributes: durability (exchanges survive broker restart); auto-delete (exchange is deleted when all queues have finished using it); additional arguments that are broker dependent.
  • Publish/Subscribe (fanout): massively multi-player online (MMO) games can use it for leaderboard updates or other global events; sport news sites can use fanout exchanges for distributing score updates to mobile clients in near real-time; distributed systems can broadcast various state and configuration updates; group chats can distribute messages between participants using a fanout exchange (although AMQP does not have a built-in concept of presence, so XMPP may be a better choice). Topic: background task processing done by multiple workers, each capable of handling a specific set of tasks; stock price updates (and updates on other kinds of financial data); news updates that involve categorization or tagging (for example, only for a particular sport or team); orchestration of services of different kinds in the cloud; distributed architecture/OS-specific software builds or packaging where each builder can handle only one architecture or OS; distributing data relevant to a specific geographic location, for example, points of sale. Publish/Subscribe: all receive.
  • Messages should not be duplicated. Producers: diverse (crawlers, input streams), content agnostic. Consumers: parallel processes that parse and organize the content; same codebase.
  • Redelivery of failed tasks. Message durability. Queue durability. Fair dispatch: PrefetchCount = X.
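
The work-queue notes above mention redelivery of failed tasks and fair dispatch via a prefetch count. As a rough illustration only, the sketch below models that behaviour with a toy in-memory queue in plain Python. It is not the presenters' client and does not talk to RabbitMQ; `WorkQueue`, `Delivery`, and the agent names are hypothetical, and putting a failed message back at the front of the queue is a simplifying assumption of this model.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Delivery:
    tag: int           # delivery tag, unique per delivery attempt
    body: str
    redelivered: bool  # True if this message was handed out before


class WorkQueue:
    """Toy stand-in for a broker queue with a per-consumer prefetch limit."""

    def __init__(self, prefetch_count):
        self.prefetch_count = prefetch_count
        self._ready = deque()   # (body, redelivered) waiting for a consumer
        self._unacked = {}      # tag -> (consumer, body)
        self._next_tag = 1

    def publish(self, body):
        self._ready.append((body, False))

    def in_flight(self, consumer):
        """Number of messages delivered to `consumer` but not yet acked."""
        return sum(1 for owner, _ in self._unacked.values() if owner == consumer)

    def deliver(self, consumer):
        """Return the next Delivery, or None if the queue is empty or the
        consumer already holds `prefetch_count` unacknowledged messages."""
        if not self._ready or self.in_flight(consumer) >= self.prefetch_count:
            return None
        body, redelivered = self._ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self._unacked[tag] = (consumer, body)
        return Delivery(tag, body, redelivered)

    def ack(self, tag):
        """Task finished: forget the message."""
        del self._unacked[tag]

    def nack(self, tag):
        """Task failed: requeue the message, flagged as redelivered."""
        _, body = self._unacked.pop(tag)
        self._ready.appendleft((body, True))


queue = WorkQueue(prefetch_count=1)
for post in ["tweet-1", "tweet-2", "tweet-3"]:
    queue.publish(post)

first = queue.deliver("agent-a")    # agent-a now holds one unacked message
blocked = queue.deliver("agent-a")  # at its prefetch limit, gets nothing
second = queue.deliver("agent-b")   # an idle agent picks up the next one
queue.nack(first.tag)               # agent-a failed; the task is requeued
retry = queue.deliver("agent-a")    # same body again, marked redelivered
```

With a prefetch count of 1, a busy agent is never handed a second message, so slow agents do not build up a private backlog while idle agents wait.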

    1. 1. About us: Name: Tarjei Romtveit. Born: 1983-02-23. Current title: Data Management Director and Software Developer at Integrasco A/S. Title period: June 2006 – Present. Education: M.Sc., ICT, System Development, University of Agder (2010). Specialties: Java, Lean Production, RabbitMQ, Spring Framework, Web Services, Hibernate, Maven, Python, MySQL, Scala, XQuery, XPath, Linux (Rights, Scripting, Security, Apache and MySQL)
    2. 2. About us: Name: Enok K. Eskeland. Born: 1986-06-18. Current title: Software Developer at Integrasco A/S. Title period: June 2011 – Present. Education: B.Sc., ICT, System Development, University of Agder (2011). Specialties: Java, Maven, Python, MySQL, XQuery, XPath
    3. 3. Scaling together with social media: RabbitMQ. A scalability story. Tarjei Romtveit & Enok Eskeland
    4. 4. The Problem: Millions of #SM users tend to look in the same direction at the same time
    6. 6. “It never rains, but it pours!” “En ulykke kommer sjelden alene” (Norwegian: “Misfortune seldom comes alone”)
    7. 7. [Architecture diagram, top to bottom: Storage Cloud; six Storage Agents; six Storage Services; six Buffer/Stage components; sources: CRM, Forum, Blogs, YouTube, Twitter, Facebook.] Our current solution should handle it: 15 000 m/m.
    8. 8. Our current solution should handle it. 15 000 m/m
    9. 9. #SM is not uniformly distributed
    10. 10. In your darkest hours: My precious software went from a Maglev to an open source
    11. 11. Start measuring!!!
    12. 12. What was wrong… [Architecture diagram, top to bottom: Storage Cloud; six Storage Agents; six Storage Services; six Buffer/Stage components; sources: CRM, Forum, Blogs, YouTube, Twitter, Facebook.]
    13. 13. • Start patching the old solution • Build from scratch • Start looking for external solutions
    14. 14. … so what should we look for? A Storage Agent – consumer. Each pipeline – queue. SM – producer. Tweet/post/discussion – a message. [Diagram: Facebook, ?, Storage Agent]
    15. 15. … well, let's start to look #elsewhere
    16. 16. Requirements • Robust • Simple to set up • Simple to maintain • Guaranteed message delivery • Lightning fast
    17. 17. What RabbitMQ could offer: Robust • Replication • Clustering • Simple by design
    18. 18. What RabbitMQ could offer: Easy to maintain • Good Linux-distro support • Plugins – Management API Plugin – Management Plugin • Simplicity
    19. 19. DEMO 1
    20. 20. What RabbitMQ could offer: QoS • Guaranteed message delivery – Transactions – Publisher confirms • Persistent messages
    21. 21. What RabbitMQ could offer: Performance • ~10 000 m/s per broker
    22. 22. Additional features. Language support: • Java Spring client • Lots of clients: C#, Erlang, Java, PHP, Python, Ruby, Perl, C++, Lisp, Haskell. Supported platforms: Solaris, BSD, Linux, MacOSX, TRU64, Windows NT/2000/XP/Vista/Windows 7, Windows Server 2003/2008, Windows 95, 98, VxWorks
    23. 23. … we made our client from scratch • Configuration • Failover • Publisher Confirms
    24. 24. AMQP 101 • Competitor to JMS • Network wire-level protocol • Programmable protocol
    25. 25. AMQP 101 [Diagram labels: Channel, Broker, Channel, Binding]
    26. 26. MESSAGE PATTERNS 1. Publish/Subscribe 2. Work Queues/Task Queues 3. Routing 4. Topics [Diagram labels: Producer, Exchange, Queue, Consumer]
    27. 27. SO WHAT PATTERN? - Messages should not be duplicated - Producers: Diverse - Consumers: Parallel processes
    28. 28. We selected the Work Queues / Task Queues pattern [Diagram: a Facebook queue feeding two Storage Agents] • Round-robin dispatching • Redelivery of failed tasks • Message durability • Fair dispatch
    29. 29. Multiple input queues [Diagram: Facebook, Twitter, YouTube and Forums queues feeding eight Storage Agents]
    30. 30. DEMO 2: How we first started out • git:// clients.git
    31. 31. Extra features: Clustering • Easy to set up – rabbitmqctl cluster rabbit@rabbit1 rabbit@rabbit2 • Disc node or RAM node • Replicates the queues and messages • NB! No sync protocol • Enables mirrored queues
    32. 32. DEMO 3: Clustering and mirrored queues
    33. 33. Extra features: Publisher confirms • Solution for guaranteed publisher – broker delivery • Not part of AMQP (a RabbitMQ extension) • Asynchronous – faster than transactions • Not supported in Spring client • Requires extra handling in the client
    34. 34. DEMO 4: Publisher Confirms • git:// clients.git
    35. 35. Additional Components: • SMS and e-mail alert process – Management REST API – Surveillance of incoming/outgoing • Central distribution of configuration – KISS: HTTP – Considering using ZooKeeper
    36. 36. Main experience • Do not trust persistence/durability entirely • There is no sync protocol in clusters • Minimize the broker interaction in the client • Failover and connection pooling are hard • Use the mailing list
    37. 37. So what did we accomplish? • Stabilized and scaled the staging component • Enabled us to focus on core processes • 50 % less maintenance
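
The publisher-confirms slide notes that confirms are asynchronous and require extra handling in the client. The sketch below illustrates the kind of bookkeeping that could involve: remembering each published message by sequence number until the broker acknowledges it, and collecting rejected messages for republishing. It is a plain-Python model under stated assumptions, not the presenters' client and not a RabbitMQ library API; `ConfirmTracker` and its method names are hypothetical. It assumes sequence numbers start at 1 and that a confirm may carry a "multiple" flag covering every sequence number up to and including the given one.

```python
class ConfirmTracker:
    """Tracks published messages until the broker confirms them."""

    def __init__(self):
        self._next_seq = 1
        self._pending = {}  # sequence number -> message awaiting confirmation

    def published(self, message):
        """Record a message that was just sent; return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._pending[seq] = message
        return seq

    def _matching(self, seq, multiple):
        # With multiple=True the confirm covers every pending number <= seq.
        if multiple:
            return sorted(s for s in self._pending if s <= seq)
        return [seq] if seq in self._pending else []

    def handle_ack(self, seq, multiple=False):
        """Broker accepted the message(s): stop tracking them."""
        for s in self._matching(seq, multiple):
            del self._pending[s]

    def handle_nack(self, seq, multiple=False):
        """Broker rejected the message(s): return them for republishing."""
        return [self._pending.pop(s) for s in self._matching(seq, multiple)]

    def unconfirmed(self):
        """Messages still waiting for a confirm, oldest first."""
        return [self._pending[s] for s in sorted(self._pending)]


tracker = ConfirmTracker()
seqs = [tracker.published(m) for m in ["a", "b", "c", "d"]]
tracker.handle_ack(2, multiple=True)  # confirms "a" and "b" in one frame
to_retry = tracker.handle_nack(3)     # "c" was rejected by the broker
still_waiting = tracker.unconfirmed() # "d" has not been confirmed yet
```

Because the publisher keeps sending while confirms arrive later, anything still listed by `unconfirmed()` when a connection drops is a candidate for republishing after failover.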