Building reliable systems from unreliable components

VP, chief architect at Kaltura
Mar. 13, 2011

More Related Content


Building reliable systems from unreliable components

  1. Arnon Rotem-Gal-Oz Building reliable systems from unreliable components
  2. Arnon Rotem-Gal-Oz
  3. What’s in a 9
  4. 0.99 reliability We have a nice little legacy business component
  5. And we move it to SOA 0.99 0.99 0.99 0.99 0.99 0.99 0.99 0.99 0.99
  6. Failsafe hardware Status Technologies FT Server
  7. Or try to detect failure , handle it and minimize its effect on the business service © Rosendahl
  9. SOA Adheres to Policy governed by Binds to End Point Exposes Serves Service Understands Contracts implements Service Consumer describes Key Component Sends/Receives Messages Sends/Receives Relation
  10. Distributed Pipes and Layered Client Agents Filters System Server Stateless Comm. SOA is derived from SOA other styles
  11. SOA vs. REST Distributed Pipes and Layered Client Replicated Uniform Virtual Agents Filters System Server Repository Interface Machine Stateless Code On Cacheable Comm. Demand SOA REST
  12. System Mobile Integration 3G Video Dedicated Monitoring MMS 3rd parties Calls Client Usage Datmart Applications Ad Management Targeted Campaign Billing Acquisition Interactions branding Advertizing Mgmt. Reporting Resources Data mining Reference & Statistics Interactions Links Data Link Managment Reports Publishing Data Web Interaction tools Interfaces Front-end Designer integration
  13. Load balancer Smart phones Web Web MMS 3G 3G Server Server Gateway Gateway Gateway (IIS/Apache (IIS/Apache Camera Phones DMZ Firewall Load balancer App App App App App App App Server Server Server Server Server Server Server Operational Firewall Backend BI & Sync. Paper NMS Links Reporting Registeration Server Editor Web Firewall Server (IIS/Apache DB Datamart DB Advertizing clients Links usage References Admin Console DMZ
  15. What’s the effect of a failure - Server E1 line = 30 concurrent video Call Flow calls Service
  16. Service Instance Service Business logic End point reaction Distribute request Dispatcher Edge Service Instance
  17. What’s the effect of a failure - Channel Call Flow Call Flow Channel Call Flow Call Flow Call Flow Call FlowChannel Channel Channel Flow Channel Call Channel Call Flow Channel Call Flow Call Flow E1 line = 30 Channel Call Flow concurrent video Call Flow Channel Channel Channel calls Channel Call Flow Call Flow Channel FlowCall Channel Call FlowChannel Call FlowChannel Channel
  18. Service Instance with NLB Real IP : Real IP : Real IP : Windows Host Cluster Host Cluster Host Service Service Edge Instance Instance Windows Kernel Windows Kernel Windows Kernel TCP/IP TCP/IP TCP/IP NLB Driver NLB Driver NIC Driver NIC Driver NIC Driver NIC NIC NIC Virtual IP :
  19. Virtual Endpoint
  20. Request/Reply EndPoint 2. 1. Request Synchronous processing 3. Reply Service Consumer Service
  21. Things look Cool & Simple ™ 3G Session Call negations Render Image resutls extraction Translation identification to links
  22. 3G GW IVP (RV) (RV) SIP Listner 3G VAS (Cestel) RTP Image Extractor WS 3G Builder (Cestel) Alg. Engine WebRenderer Resource Manager Dispatcher WebConnector Turn out Complicated & Ugly ™
  23. Parallel Pipelines Key SOA Component Pattern Component Queue Relation Concern/attribute pipeline pipeline Edge Request EndPoint Request 1 EndPoint EndPoint Request 2 Perform Perform Reaction Task Task pipeline EndPoint Perform Task Service
  24. Inversion of Communications
  25. Consumer view var sendMmsEvent = new SendMmsEvent() { FromNumber = simpleMessageDetails.DialedNumber, Subject = mmsContents.Subject, ToNumber = simpleMessageDetails.Sender, ImageExtension = mmsContents.ImageExtension, ImageAsByteArray = mmsContents.Image, TextAsByteArray = mmsContents.Text }; eventBroker.RaiseEvent(sendMmsEvent);
  26. Service view [ServiceContract] public interface ImPostOffice : ImContract, IHandleSendCoupon, IHandleSendSms,IHandleStatus,IHandleAdminStatus,IHandleWapLink, IHandleSendMms { } [ServiceContract] public interface IHandleSendMms { [OperationContract] int SendMms(SendMmsEvent eventOccured); } [DataContract] public class SendMmsEvent : ImEvent { /// <summary> /// end user's number. should be in international format: +[country-code]number. Example: +491737692260 /// </summary> [DataMember] public string ToNumber { get; set; } /// <summary> /// service's number, usually a short-code. Example: 84343 /// </summary> [DataMember] public string FromNumber { get; set; }
  27. Edge translates external structures to internal ones public int SendMms(SendMmsEvent eventOccured) { var eventContext = eventOccured.ToString(); if (log.IsDebugEnabled) log.Debug("inside 'SendMms', event context = [" + eventContext + "]"); var fromNumber = eventOccured.FromNumber; var sender = mmsSenderFactory.Get(fromNumber); if (null == sender) { if (log.IsWarnEnabled) log.Warn("cannot get mms sender derived from '" + (fromNumber ?? "null") + "'"); return 0; } IMmsSubmitResponse response; try { var mmsMessageDetails = new MmsMessageDetails(eventOccured.ToNumber, eventOccured.TextAsByteArray, eventOccured.ImageAsByteArray, eventOccured.ImageExtension, eventOccured.Subject); response = sender.Submit(mmsMessageDetails); } catch (Exception ex) { log.Error("cannot send mms message, context = [" + eventContext + "]", ex); return 0; } if (log.IsInfoEnabled) { var responseMessage = (null == response) ? "null" : response.ToString(); log.Info("sent mms with event context = [" + eventContext + "], response = [" + responseMessage + "]"); }
  28. Sagas tie instances together for conversations
  29. Call Recovery Image Extractor SIP Identification
  30. Alternative : Orchestration Service Service reaction request Manage Protocol Process Schedule Coordinator route request monitor Workflow instance Offline designer Host Workflow Engine Workflows Auxiliary tools Orchestration platform
  31. Be Wary of Nano-Services
  33. Blogjecting Watchdog Edge Service EndPoint Monitor Watchdog Watchdog Monitor Agent Request Edge EndPoint Heal Reports Monitor Report Monitor Log Monitor
  34. Blogjects concept is about collaborating objects
  35. WatchDog Service A Service B WDWatcher
  36. WatchDog Call Recovery Image Extractor SIP Identification
  37. Service Monitor Collect Metrics collection Status Fault Monitoring Reporting & Security Dashboarding Edge/Service monitoring Status Policy governance Notify Commands Control In Monitor Act Edge/Service Service Monitor
  38. RESTful resource management root 1 2 3 Abcde/ Sessions/ Efgh/ / Resources/ Dispatchers/ Xyz/
  39. http://devrig:52141/RM/Sessions/abc/ • ATOMPUB – Session details • URI (ID) • State (start/end/status etc.) • Resources – Knows status – URI for the Resource representation on the RM – URI for the Resource itself
  40. Keep the BIT
  41. WatchDog Call Recovery Image Extractor X SIP Identification WatchDog 3G Call Liveliness Monitor
  43. Same event different subscribers Player (interaction Call Flow Renderer) Play Movie Event Bridge to 3rd Call Flow Party
  44. Routing [ServiceContract] [Participate("3G")] public interface ImPlayer : ImContract, IHandleCallStarted, IHandleC IHandlePlayMovie,IHandleCallAborted { } [ServiceContract] [Participate("3GPartner")] public interface ImXsightsGateWay : ImContract, IHandleCallAborted, IHandlePlayMovie, IHandleReadyForSearch, IHandleSearchStarted, IHandleJoinThirdParty { }
  45. Raise a saga initiating event Initiator A Participant A Capacity : 2 Participant B Initiator B Capacity : 1 Now what ?!
  46. Reservation Pattern
  47. Service Instance #1 Control Control Edge Edge Service Instance #2 Business Event Control Business Event Logic Broker Edge Broker Logic Resource Event Business Resource Allocator Service Host Logic Broker Allocator Service Host Resource Host Service Service Host Allocator Service Host Service Host Control Reservation Edge Event Business Broker Logic Resource Service Host Allocator Service Host
  48. Good old 2PC to the rescue
  49. Takeaways • Things break • Decouple • Fail Fast • Monitor & Detect • Compensate • Throw state unto others
  50. Arnon Rotem-Gal-Oz @arnonrgo