JavaOne 2008 - TS-6048 - Complex Event Processing at Orbitz
A multimedia recording of this presentation can be found at


The Orbitz Worldwide data centers host numerous leading online travel agency websites that rely on thousands of services distributed across hundreds of Jini-connected VMs to serve millions of monthly visitors. Monitoring and managing these large-scale, complex applications is a daunting task. Failure happens, and downtime is money! Orbitz Worldwide has harnessed the power of Complex Event Processing to handle a torrent of monitoring events with minimal application development and hardware costs. The resulting system has improved manageability by reducing the Mean Time To Resolution (MTTR) for customer-impacting events caused by software availability, reliability and performance issues.
Orbitz Worldwide has developed a proprietary Java instrumentation API named the Extremely Reusable Monitoring API (ERMA). ERMA is as simple to use as a logging API, yet flexible enough through configuration to satisfy most requirements for logging, monitoring, analytics and other event processing needs. ERMA dynamically correlates events across distributed VMs servicing a user request, enabling efficient drill-down root cause analysis for errors and latency as well as bottom-up impact analysis. ERMA has been applied using Filters, Interceptors, Listeners, Spring-AspectJ AOP integration and custom instrumentation of core Orbitz Worldwide object models. As a result we have access to data for over 100k distinct event types with minimal development cost.
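
The correlation idea described above can be pictured with a small sketch. It is not ERMA code: the class `CorrelationSketch` and its methods `begin`, `end`, `current` and `callRemote` are hypothetical names invented for illustration, and a second thread stands in for a remote VM. The sketch only assumes that attributes such as a session id are inherited by nested scopes and copied along with each outgoing request.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of attribute inheritance; these are not ERMA's classes. */
final class CorrelationSketch {

    // One stack of attribute maps per thread; the innermost open scope is on top.
    private static final ThreadLocal<Deque<Map<String, String>>> SCOPES =
            ThreadLocal.withInitial(ArrayDeque::new);

    private CorrelationSketch() {}

    /** Opens a scope whose attributes are the parent's plus the given ones. */
    static void begin(Map<String, String> attributes) {
        Map<String, String> merged = new HashMap<>(current());
        merged.putAll(attributes);
        SCOPES.get().push(merged);
    }

    /** Closes the innermost scope. */
    static void end() {
        SCOPES.get().pop();
    }

    /** Returns a copy of the attributes visible in the innermost scope. */
    static Map<String, String> current() {
        Map<String, String> top = SCOPES.get().peek();
        return top == null ? new HashMap<>() : new HashMap<>(top);
    }

    /** Simulates a remote call: the callee thread starts from the caller's attributes. */
    static Map<String, String> callRemote(String serviceName) {
        final Map<String, String> carried = current(); // would travel with the request
        final Map<String, String> seenByCallee = new HashMap<>();
        Thread callee = new Thread(() -> {
            begin(carried);
            begin(Collections.singletonMap("name", "jiniIn_" + serviceName));
            seenByCallee.putAll(current());
            end();
            end();
        });
        callee.start();
        try {
            callee.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for callee", e);
        }
        return seenByCallee;
    }
}
```

Because every event emitted on the callee side would carry the caller's session id, events from different VMs can later be grouped per user request.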

Monitoring data corresponding to discrete events is streamed through ERMA from hundreds of VMs to a Complex Event Processing (CEP) engine in real-time where it is aggregated and processed with high throughput and low latency. A single 2-way commodity computer executing our most elaborate event processing application is able to handle nearly 100,000 events per second. The ability to handle such a large volume of data enables us to monitor services at a very fine-grained resolution as needed. Also, the hardware cost of adding new monitoring applications is minimal using this technology.
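
As a rough illustration of the kind of aggregation described above, the following sketch groups events into fixed time windows per event name and keeps a count, a failure count, and a running mean and standard deviation of latency. It is plain Java written for this text, not the CEP engine's event processing language; the class `WindowedStats` and its methods are hypothetical, and timestamps are assumed to be non-negative milliseconds.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of per-window aggregation; not the CEP engine's implementation. */
final class WindowedStats {

    /** Running aggregate for one (window, event name) pair. */
    static final class Stats {
        long total;
        long totalFailed;
        private double mean;
        private double m2; // sum of squared deviations from the mean (Welford's method)

        void add(long latencyMillis, boolean failed) {
            total++;
            if (failed) {
                totalFailed++;
            }
            double delta = latencyMillis - mean;
            mean += delta / total;
            m2 += delta * (latencyMillis - mean);
        }

        double failureRate() {
            return total == 0 ? 0.0 : (double) totalFailed / total;
        }

        double meanLatency() {
            return mean;
        }

        /** Sample standard deviation; 0 when fewer than two events were seen. */
        double stddevLatency() {
            return total < 2 ? 0.0 : Math.sqrt(m2 / (total - 1));
        }
    }

    private final long windowMillis;
    private final Map<String, Stats> statsByKey = new HashMap<>();

    WindowedStats(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Adds one event to the window that contains its timestamp. */
    void onEvent(String name, long timestampMillis, long latencyMillis, boolean failed) {
        statsByKey.computeIfAbsent(key(name, timestampMillis), k -> new Stats())
                  .add(latencyMillis, failed);
    }

    /** Returns the aggregate for the window containing the timestamp, or null if none. */
    Stats get(String name, long timestampMillis) {
        return statsByKey.get(key(name, timestampMillis));
    }

    // Assumes non-negative timestamps, so integer division floors to the window start.
    private String key(String name, long timestampMillis) {
        long windowStart = (timestampMillis / windowMillis) * windowMillis;
        return windowStart + "|" + name;
    }
}
```

Keeping only running sums per window means memory grows with the number of distinct event names and open windows, not with the number of events.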

A high-level event processing language provided with the CEP engine makes it possible to develop new monitoring applications quickly and easily. A visual development environment makes it easy to trace event flow and wire in new functionality. The event processing language has been extended by Orbitz Worldwide with custom Java functions and operators tailored to the Orbitz Worldwide environment. For example, we have developed an operator that can deliver streams of data via SNMP using the OpenNMS API in order to integrate with our Service Operations Center infrastructure.

A Java portal has been developed to visualize the output from the CEP engine. The portal presents tabular and graphical views of vital system statistics. It also publishes RSS feeds for alarms. Users can subscribe to feeds for particular alarm severities and/or affected applications.

The future of Complex Event Processing at Orbitz Worldwide includes Event Pattern Monitoring capabilities. We are developing a solution that will reduce the volume of alarms delivered to the operator by bundling customer-impacting event information with a root cause estimate, determined by detecting patterns of discrete events. As our business grows, it is imperative that our Operations team can manage the system in a scalable manner by relying on automated detection of actionable events. Complex Event Processing is the solution to this problem for Orbitz Worldwide.
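
One simple way to picture root cause estimation and alarm bundling is sketched below. It is an assumption made for illustration, not the solution under development at Orbitz Worldwide: the class `RootCauseSketch` is hypothetical, it takes trails of failed hops in a `vmid|name|exception` form, and it uses the naive heuristic that the deepest failed hop in a trail is the likely root cause.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of root cause estimation; not Orbitz Worldwide's algorithm. */
final class RootCauseSketch {

    /** One failed hop in a request's call trail; trails are ordered outermost first. */
    static final class FailedHop {
        final String vmid;
        final String name;
        final String exception;

        FailedHop(String vmid, String name, String exception) {
            this.vmid = vmid;
            this.name = name;
            this.exception = exception;
        }
    }

    private RootCauseSketch() {}

    /** Parses a "vmid|name|exception" line; the name may be empty. */
    static FailedHop parse(String line) {
        String[] parts = line.split("\\|", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected vmid|name|exception: " + line);
        }
        return new FailedHop(parts[0].trim(), parts[1].trim(), parts[2].trim());
    }

    /** Naive heuristic: the deepest failed hop in the trail is the estimated root cause. */
    static FailedHop estimate(List<FailedHop> trail) {
        if (trail.isEmpty()) {
            throw new IllegalArgumentException("trail must not be empty");
        }
        return trail.get(trail.size() - 1);
    }

    /** Counts impacted requests per estimated root cause, so each cause yields one alarm. */
    static Map<String, Integer> bundle(List<List<FailedHop>> trails) {
        Map<String, Integer> impactedRequestsByCause = new LinkedHashMap<>();
        for (List<FailedHop> trail : trails) {
            FailedHop cause = estimate(trail);
            String key = cause.vmid + "|" + cause.name + "|" + cause.exception;
            impactedRequestsByCause.merge(key, 1, Integer::sum);
        }
        return impactedRequestsByCause;
    }
}
```

With this grouping, many failed user requests that share one failing downstream service collapse into a single entry with an impact count, which is the alarm-volume reduction the paragraph describes.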

Published in: Technology, News & Politics


  • 1. COMPLEX EVENT PROCESSING AT ORBITZ WORLDWIDE Matt O’Keefe, Senior Architect Doug Barth, Technical Lead TS-6048
  • 2.
      • How to make pager duty suck less
  • 3. $10.8 Billion in Gross Bookings in 2007
  • 4. Dozens of Apps, Hundreds of VMs and Thousands of Services
  • 5. The Need for Abstraction [diagram labels: Webapp, Travel Business Services, Switching Services, abstraction, Transaction Services, Suppliers]
  • 6. Complex Event Processing is…
    • …about sensing and responding to threats and opportunities in real time.
  • 7. The Big Picture [diagram labels: Event Processors, Monitored Apps, Operations Center, Graphite, JMX, ssh, ERMA, SNMP, Portal]
  • 8. ERMA
    • Extremely Reusable Monitoring API
  • 9. The Monitor Interface
  • 10. Using EventMonitors
      protected void doValidate(RequestContext context, Object formObject, Errors errors)
              throws Exception {
          super.doValidate(context, formObject, errors);
          if (errors.hasErrors()) {
              EventMonitor validationMonitor = new EventMonitor("ValidationErrors");
              validationMonitor.set("errors", errors.getAllErrors());
              validationMonitor.fire();
          }
      }
  • 11. The CompositeMonitor Interface
  • 12. Using TransactionMonitors
      public Hotel findHotelById(Long id) {
          TransactionMonitor monitor = new TransactionMonitor(getClass(), "findHotelById");
          monitor.set("id", id);
          try {
              Hotel hotel = em.find(Hotel.class, id);
              monitor.succeeded();
              return hotel;
          } catch (RuntimeException e) {
              monitor.failedDueTo(e);
              throw e;
          } finally {
              monitor.done();
          }
      }
  • 13. TransactionMonitorTemplate
      public void cancelBooking(final Long id) {
          TransactionMonitorTemplate.INSTANCE.doInMonitor(
              getClass(), "cancelBooking",
              new TransactionMonitorCallback() {
                  public Object doInMonitor(TransactionMonitor monitor) {
                      monitor.set("id", id);
                      Booking booking = em.find(Booking.class, id);
                      if (booking != null) {
                          em.remove(booking);
                      }
                      return null;
                  }
              });
      }
  • 14. Interceptors
      public interface HandlerInterceptor {
          boolean preHandle(HttpServletRequest request, HttpServletResponse response,
                  Object handler) throws Exception;
          void postHandle(HttpServletRequest request, HttpServletResponse response,
                  Object handler, ModelAndView modelAndView) throws Exception;
          void afterCompletion(HttpServletRequest request, HttpServletResponse response,
                  Object handler, Exception ex) throws Exception;
      }
  • 15. Listeners
      public interface FlowExecutionListener {
          public void requestSubmitted(RequestContext context);
          public void requestProcessed(RequestContext context);
          public void sessionStarting(RequestContext context, FlowDefinition definition,
                  MutableAttributeMap input);
          public void sessionCreated(RequestContext context, FlowSession session);
          public void sessionStarted(RequestContext context, FlowSession session);
          public void eventSignaled(RequestContext context, Event event);
          public void stateEntering(RequestContext context, StateDefinition state)
                  throws EnterStateVetoException;
          public void stateEntered(RequestContext context, StateDefinition previousState,
                  StateDefinition state);
          public void paused(RequestContext context, ViewSelection selectedView);
          public void resumed(RequestContext context);
          public void sessionEnding(RequestContext context, FlowSession session,
                  MutableAttributeMap output);
          public void sessionEnded(RequestContext context, FlowSession session,
                  AttributeMap output);
          public void exceptionThrown(RequestContext context, FlowExecutionException exception);
      }
  • 16. Annotations
      @Monitored
      public interface UnderpantsGnomes {
          void collectUnderpants();
          void ?();
          void profit();
      }
  • 17. Spring/AspectJ
      <aop:config>
        <aop:aspect id="transactionMonitorActionAspect" ref="transactionMonitorActionAdvice">
          <aop:pointcut id="transactionMonitorActionPointcut"
              expression="target(org.springframework.webflow.execution.Action) and args(context)"/>
          <aop:around pointcut-ref="transactionMonitorActionPointcut" method="invoke"/>
        </aop:aspect>
      </aop:config>
      <bean id="transactionMonitorActionAdvice"
          class="c.o.webframework.aop.aspectj.TransactionMonitorActionAdvice"/>
  • 18. Joining Monitors - Across Threads
      TransactionMonitor monitor = new TransactionMonitor("Profit");
      ErmaCallable callable = new ErmaCallable(new CollectUnderpantsCallable());
      try {
          Future future = executorService.submit(callable);
          future.get(FUNDING_DURATION, TimeUnit.MILLISECONDS);
          monitor.addChildMonitor(callable.getMonitor());
          monitor.succeeded();
      } catch (TimeoutException e) {
          monitor.failedDueTo(e);
          throw e;
      } finally {
          monitor.done();
      }
  • 19. Joining Monitors - Distributed Services
      public DispatcherResponse doFilter(DispatcherRequest request, Object resource,
              FilterChain chain) throws Throwable {
          TransactionMonitor monitor = new TransactionMonitor(getName(request));
          MonitoringEngine mEngine = MonitoringEngine.getInstance();
          Map inheritableAttributes = mEngine.getInheritableAttributes();
          Map serializableAttributes = mEngine.makeSerializable(inheritableAttributes);
          HashMap ermaAttributes = new HashMap(serializableAttributes);
          request.addParameter(ERMA_FILTER_PARAM_KEY, OLCSerializer.serialize(ermaAttributes));
          DispatcherResponse response = chain.doFilter(request, resource);
          …
          return response;
      }
  • 20. Distributed Services (cont.)
      public DispatcherResponse doFilter(DispatcherRequest request, Object resource,
              FilterChain chain) throws Throwable {
          …
          DispatcherResponse response = chain.doFilter(request, resource);
          Object responseBlob = response.getParameter(ERMA_FILTER_PARAM_KEY);
          Monitor responseMonitor = (Monitor) OLCSerializer.deserialize((byte[]) responseBlob);
          monitor.addChildMonitor(responseMonitor);
          Throwable throwable = response.getThrowable();
          if (throwable == null) {
              monitor.succeeded();
          } else {
              monitor.failedDueTo(throwable);
          }
          return response;
      }
  • 21. The MonitorProcessor Interface
      public interface MonitorProcessor {
          public void startup();
          public void shutdown();
          public void monitorCreated(Monitor monitor);
          public void monitorStarted(Monitor monitor);
          public void process(Monitor monitor);
      }
  • 22. A MonitorProcessorAdapter Example
      public class ResultCodeAnnotatingMonitorProcessor extends MonitorProcessorAdapter {
          public void process(Monitor monitor) {
              if (monitor.hasAttribute("failureThrowable")) {
                  Throwable t = (Throwable) monitor.get("failureThrowable");
                  // Walk the cause chain down to the root exception.
                  while (t.getCause() != null) {
                      t = t.getCause();
                  }
                  monitor.set("resultCode", t.getClass().getName());
              } else {
                  monitor.set("resultCode", "success");
              }
          }
      }
  • 23. LoggingMonitorProcessor Output
      process: c.o.monitoring.api.monitor.TransactionMonitor
        -> blockedCount = 9
        -> blockedTime = 17
        -> cpuTimeMillis = 1470.0
        -> createdAt = Sun Mar 09 16:13:17 CDT 2008
        -> endTime = Sun Mar 09 16:13:30 CDT 2008
        -> failed = false
        -> hostname = oberon
        -> latency = 13180
        -> name = httpIn_/shop/airsearch/search/air/pageView_airResults
        -> remoteIpAddress =
        -> sessionId = D417C9CC585F82E1
        -> threadId = 2a20ec
        -> validationFailure = false
        -> vmid = wl
        -> waitedCount = 10
        -> waitedTime = 10414
  • 24. EventPatternLoggingMonitorProcessor Output
      wl|httpIn_/shop/airsearch/search/air/pageView_airResults|13180
      wl|RoundTripAirSearchAction.resolveRoundTripAirLocations|136
      wl|jiniOut_LocationFinderService_findAirports|84
      tbs-shop-13.31|jiniIn_LocationFinderService_findAirports|31
      tbs-shop-13.31|jiniOut_AirportLookupService_findLocationByIATACode|9
      market-8.4|jiniIn_AirportLookupService_findLocationByIATACode|3
      tbs-shop-13.31|jiniOut_AirportLookupService_findLocationByIATACode|15
      market-8.4|jiniIn_AirportLookupService_findLocationByIATACode|10
      wl|jiniOut_LocationFinderService_findAirports|48
      tbs-shop-13.31|jiniIn_LocationFinderService_findAirports|16
      tbs-shop-13.31|jiniOut_AirportLookupService_findLocationByIATACode|14
      market-8.4|jiniIn_AirportLookupService_findLocationByIATACode|10
      wl||10422
      wl|jiniOut_ShopService_createResultSet|9798
      tbs-shop-13.31|jiniIn_ShopService_createResultSet|9601
      tbs-shop-13.31||9361
      tbs-shop-13.31|com.orbitz.tbs.spi.SpiShopService.createResultSet.AIR|9333
      tbs-shop-13.31|jiniOut_LowFareSearchService_execute|9175
      air-search-7.2.1|jiniIn_LowFareSearchService_execute|9094
      air-search-7.2.1|LowFareSearchRequest|9048
      air-search-7.2.1||9038
      air-search-7.2.1||9037
      wl|jiniOut_ShopService_viewResultSet|607
      tbs-shop-13.31|jiniIn_ShopService_viewResultSet|486
      wl|pageView_airResults
      wl||2475
  • 25. EventPatternLoggingMonitorProcessor Output
      wl||NoSearchResultsAvailableException
      wl|jiniOut_ShopService_createResultSet|NoSearchResultsAvailableException
      tbs-shop|jiniIn_ShopService_createResultSet|NoSearchResultsAvailableException
      tbs-shop|c.o.t.h.s.ShopServiceImpl.createResultSet.AIR|NoSearchResultsAvailableException
      tbs-shop|c.o.t.s.SpiShopService.createResultSet.AIR|NoSearchResultsAvailableException
      tbs-shop|jiniOut_LowFareSearchService_execute|SearchSolutionNotFoundException
      air-search|jiniIn_LowFareSearchService_execute|SearchSolutionNotFoundException
      air-search|LowFareSearchRequest|SearchSolutionNotFoundException
    • Follow the trail of Exceptions… don’t bother the on-call engineers for the higher layers… save time by narrowing your log search query!
  • 26. Event Processing
  • 27. EventFlow
  • 28. Expression Language
    • !failed
    • latency > 1000 && failed
    • totalFailed/total
    • sum(int(failed))
    • stddev(latency)
    • indexof(name, 'jini') == 0
    • strftime("%Y%m%d", windowTimestamp)
  • 29. Event Processing Demo StreamBase Studio
  • 30. Custom Functions
      public class HelloWorld {
          public static String sayHello(String name) {
              return "Hello, " + name;
          }
      }
  • 31. Custom Operators
      public void processTuple(int inputPortId, Tuple tuple) throws StreamBaseException {
          String name = tuple.getString("name");
          Tuple output = outputSchema.createTuple();
          output.setField("message", "Hello, " + name);
          sendOutput(OUTPUT_PORT, output);
      }
  • 32. Visualization
  • 33. SNMP
  • 34. Graphite
  • 35. Graphite RESTful URLs target=tbs-shop.all.jiniIn.ShopService.createResultSet#all.count
  • 36. Graphite RESTful URLs - Wildcards target=tbs-shop.all.jiniIn.ShopService.*.count
  • 37. Graphite RESTful URLs - Functions target=derivative(tbs-shop.all.jiniIn.ShopService.*.count)
  • 38. Graphite RESTful URLs – Pie Charts target=cpuTime:790&target=waitedTime:4043&target=blockedTime:0&target=other:31
  • 39. Graphite RESTful URLs – Raw CSV Data target=tbs-shop.all.jiniIn.ShopService.*.count& rawData=true 0 0 0 0 0 0 0 0 0 0 0 0 0 None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None 84 92 90 84 76 81 78 84 89 83 85 81 89 None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None None 81 91 94 82 79 81 79 83 87 84 82 86 85 None None None None None None None None None None None None None 0 0 0 0 0 1 0 0 0 0 0 0 0 None None None None None None None None None None None None None 0 0 0 0 0 0 0 0 0 0 0 0 0 81 91 94 82 79 82 79 83 87 84 82 86 85
  • 40. Graphite CLI - Dashboards
  • 41. Monitoring Portal
  • 42. RSS
  • 43. Command Line Stream Discovery
      [mokeefe@egcep03 ~]$ sbc list input-streams
      input-stream DeleteFromThresholds
      input-stream GarbageCollectorStats
      input-stream ListActiveAlarmsInput
      input-stream MemoryPoolStats
      input-stream MemoryStats
      input-stream MonitorInput
      input-stream MonitoringEngineManager_lifecycle
      input-stream ReloadLog4j
      input-stream ThreadStats
      input-stream ThresholdInput
      …
  • 44. Command Line Schema Discovery
      [mokeefe@egcep03 ~]$ sbc describe MonitorInput
      <stream name="…" schema="…" uuid="...">
        <schema name="schema:MonitorInput" uuid="…">
          <field name="name" size="256" type="string"/>
          <field name="vmid" size="128" type="string"/>
          <field name="hostname" size="128" type="string"/>
          <field name="threadId" size="32" type="string"/>
          <field name="createdAt" type="timestamp"/>
          <field name="endTime" type="timestamp"/>
          <field name="latency" type="int"/>
          <field name="cpuTimeMillis" type="int"/>
          <field name="failureThrowable" size="16384" type="string"/>
          <field name="failed" type="bool"/>
          <field name="resultCode" size="128" type="string"/>
          <field name="sessionId" size="32" type="string"/>
          <field name="locale" size="24" type="string"/>
          <field name="remoteIpAddress" size="15" type="string"/>
          <field name="posCode" size="4" type="string"/>
        </schema>
      </stream>
  • 45. Command Line Queries
      [mokeefe@egcep03 ~]$ sbc dequeue MonitorInput --where "sessionId == '570EA1FABEE015B9' and vmid='tbs-shop-13.31'"
      jiniOut_AirportLookupService_findLocationByIATACode,tbs-shop-13.31,,67484c,2008-03-10 15:40:42.489-0500,2008-03-10 15:40:42.489-0500,2008-03-10 15:40:42.502-0500,13,null,null,null,false,null,null,GBP,570EA1FABEE015B9,English (United Kingdom),,null,EBUK
      jiniIn_LocationFinderService_findAirports,tbs-shop-13.31,,67484c,2008-03-10 15:40:42.480-0500,2008-03-10 15:40:42.480-0500,2008-03-10 15:40:42.503-0500,23,null,null,null,false,null,success,GBP,570EA1FABEE015B9,English (United Kingdom),,null,EBUK
      …
  • 46. Event Pattern Monitoring
  • 47. Event Pattern Monitoring (cont)
  • 48. Event Pattern Monitoring Work In Progress Orbitz Worldwide Real-time Clickstream Analysis w/Correlation
  • 49. Recap [diagram labels: Event Processors, Monitored Apps, Operations Center, Graphite, JMX, ssh, ERMA, SNMP, Portal]
  • 50. Future Directions
    • Orbitz Worldwide will open source ERMA
    • Orbitz Worldwide will open source Graphite
    • Orbitz Worldwide is considering open sourcing of
    • orbitz-lib-streambase
    • Esper integration
  • 51. For More Information
    • The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems, David Luckham, Addison Wesley Professional, May 2002, ISBN: 0201727897
    • http://
    • ERMA and Graphite: Watch for an announcement
    • http:// /careers/
    • StreamBase: http:// /
    • Esper:
  • 52. Matt O’Keefe, Senior Architect Doug Barth, Technical Lead TS-6048
  • 53. Appendix: Filters
      public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse,
              FilterChain filterChain) throws … {
          TransactionMonitor monitor = new TransactionMonitor(getClass(), "doFilter");
          try {
              filterChain.doFilter(servletRequest, servletResponse);
              monitor.succeeded();
          } catch (IOException e) {
              monitor.failedDueTo(e);
              throw e;
          } catch (ServletException e) {
              monitor.failedDueTo(e);
              throw e;
          } catch (RuntimeException e) {
              monitor.failedDueTo(e);
              throw e;
          } finally {
              monitor.done();
          }
      }
  • 54. Appendix: Interceptors
      public class ERMAInterceptor extends HandlerInterceptorAdapter {
          public boolean preHandle(…) throws Exception {
              TransactionMonitor monitor = new TransactionMonitor("httpIn");
              monitor.setInheritable(POSCODE, getPosCode());
              monitor.setInheritable(LOCALE, getLocale());
              monitor.setInheritable(CURRENCY, getCurrency());
              monitor.setInheritable(CHANNEL, getChannel());
              monitor.setInheritable(SESSION_ID, getSessionId());
              monitor.setInheritable(IP_ADDR, getClientAddress());
              return true; // let the handler chain continue
          }
  • 55. Appendix: Interceptors (cont.)
      public void postHandle(…) throws Exception {
          View view = modelAndView.getView();
          if (view == null) {
              String viewName = modelAndView.getViewName();
              EventMonitor monitor = new EventMonitor("pageView_" + viewName);
              monitor.fire();
          } else if (RedirectView.class.isAssignableFrom(view.getClass())) {
              String redirectUrl = extractDispatcherPath(request);
              EventMonitor monitor = new EventMonitor("redirect_" + redirectUrl);
              monitor.fire();
          }
      }
  • 56. Appendix: Interceptors (cont.)
      public void afterCompletion(…, Exception exception) throws Exception {
          TransactionMonitor httpInMonitor = (TransactionMonitor)
                  MonitoringEngine.getInstance().getCompositeMonitorNamed("httpIn");
          if (exception == null) {
              httpInMonitor.succeeded();
          } else {
              httpInMonitor.failedDueTo(exception);
          }
          httpInMonitor.set(Monitor.NAME, constructNewMonitorName(httpInMonitor));
          httpInMonitor.done();
      }
  • 57. Appendix: Listeners
      public void stateEntering(RequestContext context, StateDefinition nextState)
              throws EnterStateVetoException {
          EventMonitor monitor = new EventMonitor("flowExecution.stateEntering");
          StateDefinition currentState = context.getCurrentState();
          monitor.set("currentStateId", currentState.getId());
          monitor.set("nextStateId", nextState.getId());
          monitor.fire();
      }
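
The try/catch/finally and template patterns shown in the slides can be approximated without ERMA to see what they buy. The sketch below is a stand-in written for this transcript: `TimedCallSketch`, `Result` and `processed` are hypothetical names, a `Supplier` replaces the callback interface, and an in-memory list stands in for whatever processors would receive the finished monitor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for the monitor template idea; not part of ERMA. */
final class TimedCallSketch {

    /** What a processor would receive when a monitored call completes. */
    static final class Result {
        final String name;
        final boolean failed;
        final long latencyNanos;
        final Throwable failureThrowable; // null on success

        Result(String name, boolean failed, long latencyNanos, Throwable failureThrowable) {
            this.name = name;
            this.failed = failed;
            this.latencyNanos = latencyNanos;
            this.failureThrowable = failureThrowable;
        }
    }

    private final List<Result> processed = new ArrayList<>();

    /** Runs the work, records success or failure with latency, and rethrows any failure. */
    <T> T doInMonitor(String name, Supplier<T> work) {
        long start = System.nanoTime();
        Throwable failure = null;
        try {
            return work.get();
        } catch (RuntimeException | Error e) {
            failure = e;
            throw e;
        } finally {
            // Runs on both paths, mirroring the role of monitor.done() in the slides.
            processed.add(new Result(name, failure != null, System.nanoTime() - start, failure));
        }
    }

    /** Returns the results recorded so far, in completion order. */
    List<Result> processed() {
        return processed;
    }
}
```

Wrapping the bookkeeping once means a caller cannot forget the failure or completion step, which is the main appeal of the template form over hand-written try/finally blocks.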