P106 rajagopalan-read


Published in: Technology, Business

    1. 1. Profile-Directed Optimization of Event-Based Programs

       Mohan Rajagopalan, Saumya K. Debray
       Department of Computer Science, University of Arizona, Tucson, AZ 85721, USA
       {mohan, debray}@cs.arizona.edu

       Matti A. Hiltunen, Richard D. Schlichting
       AT&T Labs-Research, 180 Park Avenue, Florham Park, NJ 07932, USA
       {hiltunen, rick}@research.att.com

       2010.05.17

       ABSTRACT
       Events are used as a fundamental abstraction in programs ranging from graphical user interfaces (GUIs) to systems for building customized network protocols. While providing a flexible structuring and execution paradigm, events have the potentially serious drawback of extra execution overhead due to the indirection between modules that raise events and those that handle them. This paper describes an approach to addressing this issue using static optimization techniques. This approach, which exploits the underlying predictability often exhibited by event-based programs, is based on first profiling the program to identify commonly occurring event sequences. A variety of techniques that use the resulting profile information are then applied to the program to reduce the overheads associated with such mechanisms as indirect function calls and ar...

       ... structure user interaction code in GUI systems [8, 18], form the basis for configurability in systems to build customized distributed services and network protocols [4, 9, 16], are the paradigm used for asynchronous notification in distributed object systems [19], and are advocated as an alternative to threads in web servers and other types of system code [20, 23]. Even operating system kernels can be viewed as event-based systems, with the occurrence of interrupts and system calls being events that drive execution.

       The rationale behind using events is multifaceted. Events are asynchronous, which is a natural match for the reactive execution behavior of GUIs and operating systems. Events also allow the modules raising events to be decoupled from those fielding the events, thereby improving configurability. In short, event-based programming is generally more flexible and can often be used to realize richer execution semantics than traditional procedural or ...
    2. 2. Cited By
    3. 3. GUI exploits the underlying predictability
    4. 4. Components

       ... bound to more than one event. An event is ignored if no handlers are bound to the event. The execution order of multiple handlers bound to the same event may be important. Bindings may be static, i.e., remain the same throughout the execution of the program, or dynamic, i.e., may change at runtime. Figure 1 illustrates bindings.

       [Figure 1: Event bindings. Events A, B, C, D are mapped to Handlers 1 through 5.]

       Bindings are maintained in a registry that maps each event to a list of handlers. The registry may be implemented as a shared ...
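The registry described above can be pictured with a small sketch. This is a minimal illustration, not the paper's implementation: the class name `EventRegistry` and the methods `bind`, `unbind`, and `raise_event` are hypothetical names chosen for the example.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Handler = Callable[..., Any]


class EventRegistry:
    """Maps each event name to an ordered list of handlers (hypothetical API)."""

    def __init__(self) -> None:
        self._bindings: DefaultDict[str, List[Handler]] = defaultdict(list)

    def bind(self, event: str, handler: Handler) -> None:
        # Dynamic binding: handlers may be added at any time.
        self._bindings[event].append(handler)

    def unbind(self, event: str, handler: Handler) -> None:
        self._bindings[event].remove(handler)

    def raise_event(self, event: str, *args: Any) -> int:
        """Run every handler bound to `event` in binding order; return how many ran."""
        # Snapshot the list so a handler that rebinds does not disturb iteration.
        handlers = list(self._bindings.get(event, ()))
        for handler in handlers:
            handler(*args)
        return len(handlers)


log: List[str] = []
registry = EventRegistry()
registry.bind("A", lambda: log.append("h1"))
registry.bind("A", lambda: log.append("h2"))
ran_for_a = registry.raise_event("A")   # both handlers run, in binding order
ran_for_z = registry.raise_event("Z")   # no handlers bound: the event is ignored
```

Every raise goes through a lookup plus one indirect call per handler, which is the overhead the optimizations in the later slides target.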
    5. 5. Examples

       ... consists of two or more event handlers. Events in Cactus are user-defined. A typical composite protocol uses 10-20 different events consisting of a few external events caused by interactions with software outside the composite protocol and numerous internal events used to structure the internal processing of a message or service request. Each event typically has multiple event handlers. As a result, Cactus composite protocols often have long chains of events and event handlers activated by one event. Section 4 gives concrete examples of events used in a Cactus composite protocol.

       The Cactus runtime system provides a variety of operations for managing events and event handlers. In particular, operations are provided for binding an event handler to a specified event (bind) and for activating an event (raise). Event handler binding is completely dynamic. Events can be raised either synchronously or asynchronously, and an event can also be raised with a specified delay to implement time-driven execution. The order of event handler execution can be specified if desired. Arguments can be passed to handlers in both the bind and raise operations. Other operations are available for unbinding handlers, creating and deleting events, halting event execution, and canceling a delayed event. Handler execution is atomic with respect to concurrency, i.e., a handler is executed to completion before any other handler is started unless it voluntarily yields the CPU. Cactus does not directly support complex events, but such events can be implemented by defining a new event and having a micro-protocol raise this event when the conditions for the complex event are satisfied.

       [Figure 2: Cactus composite protocol. Between a Top API and a Bottom API, the micro-protocols DESPrivacy, KeyedMD5Integrity, RSAAuthenticity, and ClientKeyDistribution are bound to the events msgFromAbove, msgFromBelow, openSession, and keyMiss.]

       The X Window system. X is a popular GUI framework for Unix systems. The standard architecture of an X based system is shown in figure 3. The X server is a program that runs on each system supporting a graphics display and is responsible for managing device drivers. Application programs, also called X clients, may be local or remote to the display system. X servers and X clients use the X-protocol for communication. X clients are typically built on the Xlib libraries using toolkits such as Xt, GTK, or Qt. X clients are implemented as a collection of widgets, which are the basic building blocks of X applications.

       [Figure 3: Architecture of X Window systems. Devices and device drivers feed the X server, which talks over the X protocol to the X client (application, toolkit such as Xt or Qt, Xlib).]

       An X event is defined as "a packet of data sent by the server to the client in response to user behavior or to window system changes resulting from interactions between windows" [18]. Examples of X events include mouse motion, focus change, and button press. These events are recognized through device drivers and relayed to the X server, which in turn conveys them to X clients. The X framework specifies 33 basic events. X clients may choose to respond to any of these based on event masks that are specified at bind time. Events are also used for communication between widgets. Events can arrive in any order and are queued by the X client. Event activation in X is similar to synchronous activation in the general model.

       The X architecture has three mechanisms for handling events: event handlers, callback functions, and action procedures. All three map to handlers in the general model and are used to specify different granularities of control. Event handlers, the most primitive, are simply procedures bound to event names. Callback functions and action procedures are more commonly used high-level abstractions ... procedure.

       In addition to these three, X has a number of other mechanisms that can be broadly classified as event handling, namely timeouts, signal handlers, and input handlers. Each of these mechanisms allows the program to specify a procedure to be called when a given condition occurs. For all these handler types, X provides operations for registering the handlers and activating them.

       ... bound to the event at that time results in a correct transformation. Similarly, it is easy to see that sequences of nested synchronous activations can be readily optimized. The specific optimization techniques and their limitations are discussed below in section 3.

       3. OPTIMIZATION APPROACH
       Compiler optimizations are based on being able to statically predict aspects of a program's runtime behavior using either invariants that always hold at runtime (i.e., based on dataflow analysis) or assertions that are likely to hold (i.e., based on execution profiles). Event-based systems, in contrast, are largely unpredictable in their runtime behavior due to uncertainties associated with the behavior ...
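The difference between synchronous, asynchronous, and delayed activation in the Cactus description above can be sketched as follows. This is a toy model under stated assumptions, not the Cactus API: `Runtime`, `raise_sync`, `raise_async`, and the simulated clock are hypothetical, and time is simulated rather than measured.

```python
import heapq
from collections import defaultdict
from itertools import count


class Runtime:
    """Toy event runtime; all names are illustrative, not the Cactus API."""

    def __init__(self):
        self.handlers = defaultdict(list)  # event -> [(order, handler)]
        self.pending = []                  # heap of (due_time, seq, event, args)
        self.now = 0.0                     # simulated clock
        self._seq = count()                # tie-breaker: FIFO among equal due times

    def bind(self, event, handler, order=0):
        # An explicit order lets the program fix handler execution order.
        self.handlers[event].append((order, handler))
        self.handlers[event].sort(key=lambda pair: pair[0])  # stable sort

    def raise_sync(self, event, *args):
        # Synchronous: every handler runs to completion before the raiser continues.
        for _, handler in list(self.handlers.get(event, ())):
            handler(*args)

    def raise_async(self, event, *args, delay=0.0):
        # Asynchronous (optionally delayed): queue the event and return at once.
        heapq.heappush(self.pending, (self.now + delay, next(self._seq), event, args))

    def run(self):
        # Event loop: deliver queued events in order of their due time.
        while self.pending:
            due, _, event, args = heapq.heappop(self.pending)
            self.now = max(self.now, due)
            self.raise_sync(event, *args)


trace = []
rt = Runtime()
rt.bind("tick", lambda: trace.append("tick"))
rt.bind("msg", lambda m: trace.append("second:" + m), order=2)
rt.bind("msg", lambda m: trace.append("first:" + m), order=1)
rt.raise_async("tick", delay=5.0)         # time-driven execution
rt.raise_async("msg", "late", delay=1.0)
rt.raise_sync("msg", "now")               # handled before anything queued
rt.run()
```

Only the synchronous path has a statically known continuation, which is presumably why nested synchronous activations are the ones described as readily optimizable.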
    6. 6. Approach: Event Profiling; Optimization Techniques (Graph Optimizations, Compiler Optimizations); Dealing with the Unexpected
    7. 7. Event Profiling

          EventGraph = ∅;
          prev_event = eventTrace.firstEvent;
          while not (end of eventTrace)
              event = eventTrace.nextEvent;
              if (prev_event, event) not in EventGraph
                  EventGraph += (prev_event, event);
                  EventGraph(prev_event, event).weight = 1;
              else
                  EventGraph(prev_event, event).weight++;
              prev_event = event;

       Figure 4: GraphBuilder algorithm.

       ... behavior of their external environment, e.g., the user's actions. We ...
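A runnable rendering of the GraphBuilder pseudocode might look like the sketch below. It assumes the event trace is simply an iterable of event names; the function name `build_event_graph` and the sample trace are illustrative and do not come from the paper's profile data.

```python
from collections import Counter
from typing import Hashable, Iterable


def build_event_graph(event_trace: Iterable[Hashable]) -> Counter:
    """Count how often each event is immediately followed by another.

    The result maps an edge (prev_event, event) to its weight.
    """
    graph: Counter = Counter()
    trace = iter(event_trace)
    try:
        prev_event = next(trace)       # eventTrace.firstEvent
    except StopIteration:
        return graph                   # empty trace: empty graph
    for event in trace:                # eventTrace.nextEvent until the end
        graph[(prev_event, event)] += 1  # new edges start at 0, so this yields 1
        prev_event = event
    return graph


# Illustrative trace only; not taken from the paper's measurements.
sample_trace = ["SegFromUser", "Seg2Net", "SegFromUser", "Seg2Net", "Adapt"]
event_graph = build_event_graph(sample_trace)
```

Using a `Counter` folds the "edge not yet in graph" and "increment weight" branches of the pseudocode into one statement.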
    8. 8. [Figure 5: Event graph generated from video player. Nodes: Sample, ControllerFired, MsgFrmUserH, MsgFrmUserL, SendMsg, ControllerFiring, SegFromUser, SegmentSent, Adapt, SegmentTimeout, Seg2Net, SegmentAcked, Controller, ControllerClkH, ControllerClkL, AddSysInput, Open, ResizeFragment. Edges carry weights. Key: synchronously activated events; asynchronously activated events.]
    9. 9. Graph Optimizations (1): Handler Merging

       [Figure 6: Reduced event graph, Threshold = 300. Remaining nodes: MsgFromUserL, MsgFromUserH, SegFromUser, Seg2Net, ControllerFired, ControllerFiring, Controller, Adapt, ControllerClkH, ControllerClkL. Remaining edge weights range from 310 to 552.]
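One plausible way to obtain a reduced graph like Figure 6 is to drop every edge whose weight falls below the threshold. The sketch below assumes exactly that rule (keep an edge when weight >= threshold); the slide only states the threshold value, so the comparison is an assumption, and the sample edges are invented for illustration.

```python
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def reduce_event_graph(graph: Dict[Edge, int], threshold: int) -> Dict[Edge, int]:
    """Keep only frequently taken edges (assumed rule: weight >= threshold)."""
    return {edge: weight for edge, weight in graph.items() if weight >= threshold}


def hot_events(reduced: Dict[Edge, int]) -> List[Hashable]:
    """Events that still take part in at least one remaining edge."""
    return sorted({event for edge in reduced for event in edge})


# Invented weights, chosen only to show the filtering effect.
profile = {("A", "B"): 392, ("B", "C"): 310, ("C", "A"): 42, ("A", "D"): 1}
reduced = reduce_event_graph(profile, threshold=300)
remaining = hot_events(reduced)
```

Rarely taken edges stay handled by the generic dispatch path, so the optimizer can concentrate on the chains that dominate the profile.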
    10. 10. Graph Optimizations (2): Event Chains and Subsumption

       [Figure 7: Handler merging. Handler1 { H1_code }, Handler2 { H2_code }, and Handler3 { H3_code }, all bound to Event A, are merged into a single Handler123 { H1_code H2_code H3_code }.]

       [Figure 8: Subsuming events. Event graph and handler graph view for SegFromUser and Seg2Net.]

       ... from within a handler for SegFromUser, the latter will wait until the handling of Seg2Net has been completed, at which point control will return to the handler for SegFromUser. In this case, the handler for Seg2Net can be subsumed into that for SegFromUser, thereby eliminating the synchronous raise event between them.

       ... translates into a sequence of indirect function calls. There are two ...

       ... event graph for a video player application ... a configurable transport protocol called CTP ...
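The two graph optimizations named on this slide can be mimicked in a few lines. This is only a model of the dispatch structure: Python cannot splice function bodies the way a compiler can, so `merge_handlers` fixes the handler sequence once instead of looking it up per raise, and subsumption is shown by replacing a registry-based raise with a direct call. All function names here are hypothetical.

```python
from typing import Any, Callable, List, Sequence

Handler = Callable[..., None]


def merge_handlers(handlers: Sequence[Handler]) -> Handler:
    """Handler merging: one handler that runs H1, H2, H3 back to back."""
    bodies = tuple(handlers)  # binding order frozen at optimization time

    def merged(*args: Any) -> None:
        for body in bodies:
            body(*args)

    return merged


def make_seg_from_user(raise_sync: Callable[..., None], log: List[str]) -> Handler:
    """Handler for SegFromUser that synchronously raises Seg2Net mid-way."""
    def seg_from_user(seg: str) -> None:
        log.append("SegFromUser:start")
        raise_sync("Seg2Net", seg)   # control returns here once Seg2Net is handled
        log.append("SegFromUser:end")
    return seg_from_user


def run_generic() -> List[str]:
    log: List[str] = []
    registry = {"Seg2Net": [lambda seg: log.append("Seg2Net:" + seg)]}

    def raise_sync(event: str, *args: Any) -> None:
        for handler in registry.get(event, []):  # lookup + indirect calls
            handler(*args)

    make_seg_from_user(raise_sync, log)("s1")
    return log


def run_subsumed() -> List[str]:
    log: List[str] = []

    def seg2net(seg: str) -> None:
        log.append("Seg2Net:" + seg)

    def direct(_event: str, seg: str) -> None:
        seg2net(seg)  # Seg2Net handling folded into SegFromUser: no registry lookup

    make_seg_from_user(direct, log)("s1")
    return log


merged_log: List[tuple] = []
handler123 = merge_handlers([
    lambda x: merged_log.append(("H1", x)),
    lambda x: merged_log.append(("H2", x)),
    lambda x: merged_log.append(("H3", x)),
])
handler123(7)
```

The point of the comparison is that the subsumed version produces the same observable sequence as the generic one, which is what makes the transformation safe as long as the bindings do not change.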
    11. 11. Compiler Optimizations: Function Inlining; Redundant Code Elimination
    12. 12. Experiment Results

       Figure 10: Video player optimization results.

       Frame rate   Total Execution Time (sec)    Event Handler Time (sec)
                    Orig.    Opt.    (%)          Orig.    Opt.    (%)
       10           43.1     41.9    97.2         2.3      0.9     39.1
       15           30.9     30.3    98.0         1.6      0.6     37.5
       20           24.5     22.1    90.2         1.5      0.5     33.3
       25           23.9     21.3    89.1         1.5      0.5     33.3

       Key: Orig: Original program; Opt: Optimized program; (%): Opt. as a percentage of Orig.

       ... optimization on overall execution time becomes more pronounced as the frame rate increases is that when the frame rate is low, the CPU is idle a large part of the time. As a result, the unoptimized program can simply use a bit more of the idle time to keep up with the required frame rate. However, when the frame rate increases, both programs must do more work in a time unit and the idle time decreases. When the frame rate becomes high enough, the unoptimized program runs out of extra idle time and starts falling behind the optimized program. This indicates that our optimizations are especially effective for mobile systems such as handheld PDAs that tend to have less powerful processors than desktop systems.

       Figure 11: Event processing times in the video player.

       Event          Processing Time (µsec)     Speedup (%)
                      Original    Optimized
       Adapt          55          11             80.0
       SegFromUser    346         41             88.2
       Seg2Net        137         37             73.0

       SecComm is a configurable secure communication service that allows customization of security attributes for a communication connection, including privacy, authenticity, integrity, and non-repudiation. One of the features of SecComm is its support for implementing a security property using combinations of basic security micro-protocols. We optimized a configuration of SecComm with three micro-protocols, two of which encrypt the message body ...

       Figure 12: Impact of optimization in SecComm.

       Size    Push time (µsec)           Pop time (µsec)
               Orig.    Opt.    (%)       Orig.    Opt.    (%)
       64      274      241     88.0      397      378     95.2
       128     287      263     91.6      460      448     97.4
       256     304      273     89.8      484      457     94.4
       512     336      299     89.0      494      470     95.1
       1024    430      373     86.7      608      570     93.8
       2048    572      552     96.5      1016     893     87.9

       ... that the time for the push portion is reduced markedly in most cases, with improvements of up to 13.3%. The improvements in the pop portion are also noticeable although not as high as for the push portion, typically around 5% but going as high as 12%.

       These optimizations were performed using the Athena widget family based on Xt and Xlib provided with XFree86. The Athena toolkit is a minimal toolkit with limited configurability, and therefore provided limited scope for applying our optimizations. The event model in more recent (and popular) toolkits such as Gnome ...

       Figure 13: Optimization of X events.

       Event Type   Execution Time (µsec)
                    Orig.    Opt.    (%)
       Scroll       158      148     93.7
       Popup        37       31      83.8

       ... about 6% and that of Popup by over 16%. These techniques were applied here at the level of action handlers, although it would be possible to optimize one step further by opening up callbacks in the same way.

       An examination of the effects of our optimizations on these two programs indicates two main sources of benefits: the reduction of argument marshaling overhead when invoking event handlers, and handler merging that leads to a reduction in the number of handler invocations. The elimination of marshaling overhead seems to have the largest effect on the overall performance improvements achieved. The main effect of handler merging is to reduce the number of function calls between handlers that are executed in sequence. Merging also creates opportunities for additional code improvements due to standard compiler optimizations.

       Code in event handlers is usually a small fraction of the total program size. To measure the effects of our optimization on code size, we counted the number of instructions in the original and optimized programs using the command objdump -d program | wc -l. Our optimizations produce a code size increase of 1.3% for the video player and 1.1% for SecComm.

          if event binding for event B changed
              call(original code for event B);
          else
              merged, inlined, and optimized code for event B;
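The binding check in the pseudocode fragment above (fall back to the original code if the bindings for event B changed, otherwise run the merged and inlined code) can be sketched as a guard around the fast path. This is a minimal illustration under assumptions: the class `GuardedEvent`, the plain-dict registry, and the snapshot comparison are hypothetical; a real system might use a cheaper check such as a version counter.

```python
from typing import Any, Callable, Dict, List

Handler = Callable[..., None]


class GuardedEvent:
    """Runs optimized code only while bindings match those seen by the optimizer."""

    def __init__(self, registry: Dict[str, List[Handler]], event: str,
                 optimized: Handler) -> None:
        self.registry = registry
        self.event = event
        self.optimized = optimized
        # Bindings the optimized code was specialized for.
        self.snapshot = tuple(registry[event])

    def fire(self, *args: Any) -> None:
        if tuple(self.registry[self.event]) != self.snapshot:
            # Bindings changed at runtime: use the original generic dispatch.
            for handler in list(self.registry[self.event]):
                handler(*args)
        else:
            # Bindings unchanged: merged, inlined, optimized code is safe.
            self.optimized(*args)


out: List[tuple] = []


def h1(x: int) -> None:
    out.append(("h1", x))


def h2(x: int) -> None:
    out.append(("h2", x))


def h3(x: int) -> None:
    out.append(("h3", x))


bindings: Dict[str, List[Handler]] = {"B": [h1, h2]}
event_b = GuardedEvent(bindings, "B", optimized=lambda x: out.append(("fast", x)))

event_b.fire(1)            # bindings as profiled: fast path
bindings["B"].append(h3)   # dynamic rebinding invalidates the assumption
event_b.fire(2)            # guard fails: original handlers run instead
```

The guard keeps the optimization safe under dynamic binding at the cost of one comparison per activation.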