Running an erlang based messaging system on AWS


Published on

Published in: Technology
  • Be the first to comment

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Running an erlang based messaging system on AWS

  1. 1. Running an Erlang BasedMessaging System on AWSLahav Savir, Architect & CEOEmind systems
  2. 2. AboutLahav Savir• 15++ years’ experience• Architect and CEO @ Emind SystemsEmind Systems (est. 2006)• Highly professional system integrator• Dedicated Cloud Architects and DevOps teams• 24x7 SLA by DevOps Specialists• ~100 AWS customers• Partnerships with leading cloud vendors
  3. 3. PartnershipsAmazon Advanced Consulting Partner
  4. 4. What is MMGS ?MMGS is an End-to-End Mobile Messaging Gateway System• Mobile Instant Messaging Gateway that enables optimized connection tovarious back-end community protocols• Mobile E-Mail Gateway that enables optimized communication betweenhandset and any mail server• Control including Management, Authentication, Authorization, Billing andReporting capabilitiesMobile MessagingGateway System(MMGS)WV / XHTML / AJAXIMAP4 / SMTPWirelessNetworkXMPPGTalkISP MailServer
  5. 5. MMGS Supported Features• Supported Clients– Ogo family of devices– Generic Handset Client(J2ME)– Web Client (Mobile & PC)• Instant MessagingCommunities– Windows Live (MSP3.0)– ICQ, AIM– GTalk– Jabber (XMPP)– Facebook + Chat• E-mail Gateway Services– E-mail Push– IMAP, POP3, SMTP– Forward withoutDownload (LEMONADE)– Hotmail through MSP3.0
  6. 6. • Optimized IM Data Channel– Secure access to IM Services– Enables data optimization between Client and MMGS– Reduces data traffic consumption by 70%-80%– Increases battery life time• Optimized E-mail Access Services– Secure access (IMAP + SSL, SMTP + SSL)– Email Push using IMAP IDLE (supports POP3, IMAP, Hotmailmailboxes)– E-mail attachments– E-mail content adaptation– Reduces data traffic consumption by 70%-80%Optimizations & Enhancements
  7. 7. Architecture Approach• Focus on GW features• 3 Tiers– Front-end, Router, Transport– High-Speed RPC System(eTunnel)• Security– No access to core switching& session data– Limited access to transportnodes– One way firewalls• Clustering– No single point of failure– Redundant components– In-memory replicated DB(minimal shared data)– Internal health system– Advanced failover• Load Balancing– Hash based distribution– Consistent paths– Least connections– URL load balancing
  8. 8. FIFO Queue(add & clear)Generic Transport nodeIn-memory cache/ Disk cacheNotificationHandlerThrottleCommunity ServersLogSNMPO&M
  9. 9. System ArchitectureWindowsLive ServiceICQ ServiceAOLServiceYMSGServiceWV Plug-inRouterMSPPlug-inICQPlug-inAOL Plug-inXMPPPlug-inWV Front-end CIR Front-endWBXMLHTTP WVoTCP CIR ServerXMPP Plug-inXMPP Front-endXMPP APIPrivate IMFront-endRouterTransport3rdPartiesWeb Front-endIM ClientWeb Plug-inUser InterfaceGIS Plug-inSMTPPlug-inPOP3Plug-inIMAPPlug-inMSPAOLWVProto.XMPPXMPPProto.ICQWVProto.MSPMSPProto.IMAPIMAPProto.POP3POP3Prot.SMTPSMTPProto.BOSHMail ServiceContactsXML/HTTPContactsPluginContactServiceYMSGWVProto.Profile Svc.Plug-inCacheVideo RelayPlug-inFacebookFBProto.TwitterTwitterProto.Facebook TwitterFacebook TwitterIMAP Front-endIMAPSMTP Front-endSMTPSMTP Plug-inIMAP Plug-inGtalkServiceMMGSAvailableComponenetMMGS FutureComponentLegend3rdPartySystemContentConvertorReportingAuthenticationManagementMonitoringLoggingSatelliteSystems
  10. 10. No single point of failureTransportsRoutersFront-endsFirewalls FirewallsLoadBalancersLoadBalancerseTunnel eTunnel
  11. 11. AWS Deployment
  12. 12. Performance Benchmark MeasuresSetup• Basic cluster built of 11 servers– m1.medium and m1.large instancesInstant Messaging performance• 150,000 Connected users• 12,600 Messages per second• 45,000,000 Messages an hourSystem resources• 50%-60% system utilization
  13. 13. What Made the Difference?• High speed socket based RPC – eTunnel• Resource pools (eBalancer)• Minimal data sharing / Distributing the data• Hash based distribution between layers makes consistent path throughlayers• Health checks Integration with load balancer• Reduce destructive DB operations, especially on replicated tables• Internal health system integrated with topology manager• Extensive SNMP MIBs• Throttling – Control concurrency
  14. 14. Flexibility for business models• Reporting sys.integrated withauthentication &authorization• Supports anybusiness model• Daily / monthlypay• Per use• Per Message• Per data size• Per Service
  15. 15. Satellite Systems• Authentication– Authentication, Authorization & Control over the servicedelivered• Reporting– Nearly real-time usage reports aggregated to an hour cycle• Operation and Maintanance– Stop / Start, Watch real-time alarms, configuration– Traceability and troubleshooting– Software update - hot patch distribution and loading– Central log console– Central graphing system• Real Time Monitoring – Infrastructure, OS and App
  16. 16. Authentication SystemThe Authentication System is a satellite system providing full Authentication,Authorization & Control over the service delivered by MMGS• Source IP based access• Device based access (enable / disable a device)• Service based access (enable / disable a service for a device)– MSN, ICQ, AIM, Jabber, Gtalk, IMAP, POP3, Hotmail, Facebook• Feature based access (enable / disable a specific feature for a service)– Nudge, Tiles, E-mail Push, E-mail Attachments• User friendly, secure and easy to use Web GUI• Feature Rich and secure XML over HTTP API to enable operators tomanage the service delivered to their customer
  17. 17. Authentication UI
  18. 18. Reporting SystemThe reporting system provides a nearly real-timeusage reports aggregated to an hour cycle• Operator based reports, summary of activities for anoperator• Service based reports, summary of activities for a service• Device based reports, summary of activities for device• Secure Web Based User Interface• Secure XML over HTTP rich API to support Operatorsbilling systems
  19. 19. Reporting system UI
  20. 20. Operation & MaintenanceThe Operation & Maintenance provides full control overthe MMGS servers• Complete cluster management– Stop / Start, Watch real-time alarms, configuration– Traceability - trace system control over, stop / start devicebased trace, configure logging level– Troubleshooting – configuration mismatch, versionmismatch, topology mismatch– Software update - hot patch distribution and loading• Central Log console• Central graphing system provides an in depth view intothe systems counters (real-time and historical)
  21. 21. Operation & Management UI
  22. 22. Real-Time MonitoringReal-Time Monitoring system provides an indepth monitoring of the system’s resourcesand performance via Web Based UI• Service operator view• Customer view• Service counters thresholds (QoS, success rates, message latency)• Application Counters threshold (application concurrency)• Application Services (Protocols)• Operating System threshold (CPU, Mem, Disks, Net)• Hardware Resources
  23. 23. Graph system UI
  24. 24. Summary• Mobile Aware• Optimized data channels• Longer battery life• Protocol Conversion• Content Adaptation• Authentication &Authorization• Service, Device and Featurelevels• Billing• Operator, Device andService levels• Management & Control• User friendly operation• In depth system visibility• Redundancy & Scalability• Active/active (no singlepoint of failure)• Linear scalability• Proven high performance• Security• One way firewall• 3 Physical layers
  25. 25. (54) 4321688