Scheduler Activation Study Paper

Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism

Ashay Thool, Department of Computer Engineering, Santa Clara University, Santa Clara, CA, USA. Email: athool@scu.edu
Tushar Patil, Department of Computer Engineering, Santa Clara University, Santa Clara, CA, USA. Email: tpatil@scu.edu

Abstract— Processes are well suited to multiprogramming in a uniprocessor environment, but they are simply too inefficient for general-purpose parallel programming in a modern multiprocessor environment: they handle only a low degree of parallelism well, which is why threads are preferred for concurrency in many approaches to parallel programming. The shortcomings of traditional processes have led to the use of threads for general-purpose parallel programming. Threads make the control of parallelism sufficiently cheap that the programmer or compiler can exploit even fine-grained parallelism with acceptable overhead. Threads can be supported either by the operating system kernel or by user-level library code in the application address space. The performance of kernel threads is inherently worse than that of user-level threads, so we argue that managing parallelism at the user level is essential to high-performance parallel computing. We then discuss the lack of system integration exhibited by user-level threads, which results from the absence of kernel support for user-level threads in contemporary multiprocessor operating systems. Finally, we describe the design, implementation, and performance of a new kernel interface and user-level thread package that together provide the same functionality as kernel threads without compromising the performance and flexibility advantages of user-level management of parallelism.

Keywords— Multiprocessor, Threads.

I. INTRODUCTION

A. User-level threads

User-level threads are application specific and are managed by runtime library routines linked into each application, so thread management operations require no kernel intervention. The result can be excellent performance, because there is no kernel overhead. User-level thread systems are typically built without any modifications to the underlying operating system kernel, so they are flexible and do not depend on the underlying OS; user-level threads can be implemented in a number of ways, tailored to the requirements of each application. The thread system views each process as a "virtual processor" and treats it like a physical processor executing under its control. In reality, though, these virtual processors are multiplexed across real, physical processors by the underlying kernel.

B. Kernel-level threads

Kernel-level threads are scheduled directly by the kernel (not by a user-level thread scheduler), so events such as I/O and page faults are handled correctly: a blocked thread does not affect the others.
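To make the cost argument concrete, here is a minimal sketch of a user-level thread being created and switched to entirely in user space, using the POSIX ucontext(3) interface. All names (worker, thr_ctx, STACK_SZ) are illustrative and not from any particular package; a production package such as FastThreads would typically use hand-written assembly for the switch, and glibc's swapcontext() still performs one signal-mask system call, so this only approximates a fully trap-free switch.

/* A minimal sketch of a user-level thread switch, assuming POSIX
 * ucontext(3). Names here are hypothetical, not from the paper. */
#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

#define STACK_SZ (64 * 1024)

static ucontext_t main_ctx, thr_ctx;

static void worker(void) {
    puts("user-level thread running");
    /* Returning resumes uc_link (main_ctx): "thread exit" with no trap. */
}

int main(void) {
    char *stack = malloc(STACK_SZ);
    getcontext(&thr_ctx);                 /* initialize from current context */
    thr_ctx.uc_stack.ss_sp   = stack;     /* give the thread its own stack   */
    thr_ctx.uc_stack.ss_size = STACK_SZ;
    thr_ctx.uc_link          = &main_ctx; /* where to go when worker returns */
    makecontext(&thr_ctx, worker, 0);
    swapcontext(&main_ctx, &thr_ctx);     /* dispatch the thread in user space */
    puts("back on the virtual processor");
    free(stack);
    return 0;
}

Creation, dispatch, and exit all happen in the application's address space; the kernel never learns that a "thread" existed, which is both the source of the performance advantage and, as the next section shows, the source of the integration problem.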
II. PROBLEM

A. User-level threads

The "virtual processor" concept sounds attractive, but, as the name suggests, it is only virtual: the approach works well only as long as everything goes well. Many "real world" factors such as I/O and page faults break the equivalence between virtual and physical processors; when this happens, user-level threads built on top of traditional processes can exhibit poor performance or even incorrect (inconsistent) behavior. The lack of system integration exhibited by user-level threads is caused by the lack of kernel support for user-level threads in contemporary multiprocessor operating systems; the kernel simply does not know what is going on at the user level. Consider, for example, what happens when a thread blocks (on a page fault or I/O), or when a thread is preempted while holding a lock. To counter this problem, a user-level thread system will often create as many kernel threads to serve as "virtual processors" as there are physical processors in the system, each used to run user-level threads. When a user-level thread makes a blocking I/O request or takes a page fault, though, the kernel thread serving as its virtual processor also blocks. As a result, the physical processor is lost to the address space while the I/O is pending, because there is no kernel thread left to serve as an execution context for running other user-level threads on the just-idled processor (a sketch of this failure mode follows below). In addition, because the thread package is ordinary user-level code, the effort of building and maintaining it falls on the application developer, which adds to the cost of creating and managing parallelism this way.

B. Kernel-level threads

With kernel threads, the program must cross an extra protection boundary (which guards the kernel against attacks and crashes) on every thread operation, even when the processor is merely being switched between threads in the same address space. This involves not only an extra kernel trap; the kernel must also copy and check parameters in order to protect itself against buggy or malicious programs. By contrast, invoking user-level thread operations can be quite inexpensive. Further, safety is not compromised: address space boundaries isolate misuse of a user-level thread system to the program in which it occurs. The cost of generality is also significant: kernel-level threads are not flexible and cannot be made application specific. The kernel could be modified to support multiple parallel programming models, but this increases complexity, overhead, and the likelihood of errors in the kernel. In short, kernel threads are better than processes, but worse than user-level threads.
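The following sketch, using POSIX pthreads, makes the failure mode from Section II.A concrete; virtual_processor and the ready-list comments are hypothetical stand-ins for a user-level package's internals. When the blocking read() traps into the kernel, the pthread serving as this virtual processor blocks too, so every user-level thread multiplexed on it loses the physical processor until the I/O completes; the kernel has no way to know that other user-level work is runnable.

/* Hypothetical illustration of the integration problem: a kernel
 * thread serving as a "virtual processor" blocks on I/O, idling the
 * physical processor for all user-level threads multiplexed on it. */
#include <pthread.h>
#include <unistd.h>

static void *virtual_processor(void *arg) {
    char buf[128];
    /* ... pick a user-level thread from the ready list and run it ... */
    ssize_t n = read(STDIN_FILENO, buf, sizeof buf); /* blocks in kernel */
    (void)n;
    /* Only now can the next user-level thread run on this processor. */
    return NULL;
}

int main(void) {
    pthread_t vp;
    pthread_create(&vp, NULL, virtual_processor, NULL);
    pthread_join(vp, NULL);
    return 0;
}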
III. APPROACH

Our goal is to achieve the best of both worlds, combining kernel functionality with user-level threads in a new thread management system: functionality equal to that of kernel-level threads, with performance and flexibility similar to user-level threads. Our approach provides each application with a virtual multiprocessor, an abstraction of a dedicated physical machine. Each application knows exactly how many (and which) processors have been allocated to it and has complete control over which of its threads are running on those processors. The operating system kernel has complete control over the allocation of processors among address spaces, including the ability to change the number of processors assigned to an application during its execution. To achieve this, the kernel notifies the address space's thread scheduler of every event affecting the address space.

The kernel's role is to vector events that influence user-level scheduling to the address space for the thread scheduler to handle, rather than to interpret these events on its own, as a traditional operating system would. In turn, the thread system in each application's address space notifies the kernel of the subset of user-level events that can affect processor allocation decisions. By communicating all kernel events upward, functionality is improved, because the application has complete knowledge of its scheduling state. By communicating downward only those events that affect processor allocation, good performance is preserved, since most events (e.g., simple thread scheduling decisions) do not need to be reflected to the kernel. The kernel mechanism we use to realize these ideas is called scheduler activations.

IV. SCHEDULER ACTIVATIONS

The term scheduler activation was chosen because each vectored event causes the user-level thread system to reconsider its scheduling decision of which thread to run on which processor. A scheduler activation serves as a vessel, or execution context, for running user-level threads, in exactly the same way a kernel thread does. It notifies the user-level thread system of kernel events, and it provides space in the kernel for saving the processor context of the activation's current user-level thread when that thread is stopped by the kernel, e.g. when the thread blocks on I/O or its processor is preempted. With scheduler activations, the invariant maintained by the kernel is that there are always exactly as many running scheduler activations as there are processors assigned to the address space. The goal is to mimic the functionality of kernel threads while retaining the performance of user-level threads, avoiding unnecessary user/kernel transitions; the kernel communicates with the user level through the upcalls listed in Table I.

Table I lists the events that the kernel vectors to the user level using scheduler activations; the parameters to each upcall are in parentheses, and the action taken by the user-level thread system is italicized. Note that events are vectored at exactly the points where the kernel would otherwise be forced to make a scheduling decision. In practice, these events occur in combinations; when this occurs, a single upcall is made that passes all of the events that need to be handled.
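Since the table itself does not survive in this transcript, the following sketch restates the four upcall events of Table I from the original paper [1] as C declarations. The signatures are an assumption: the paper specifies events and parameters, not a C API. Each upcall runs on a fresh scheduler activation and re-enters the user-level scheduler.

/* Sketch of the Table I upcall interface [1]; hypothetical signatures. */
typedef struct machine_state machine_state_t;  /* saved registers, PC, SP */

/* "Add this processor (processor #)":
 * execute a runnable user-level thread on the new processor. */
void add_this_processor(int processor);

/* "Processor has been preempted (preempted activation # and its
 * machine state)": return to the ready list the user-level thread that
 * was executing in the context of the preempted activation. */
void processor_has_been_preempted(int activation, machine_state_t *state);

/* "Scheduler activation has blocked (blocked activation #)":
 * the blocked activation is no longer using its processor. */
void scheduler_activation_has_blocked(int activation);

/* "Scheduler activation has unblocked (unblocked activation # and its
 * machine state)": return that thread to the ready list. */
void scheduler_activation_has_unblocked(int activation, machine_state_t *state);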
Fig. 1 illustrates what happens on an I/O request and its completion. This is the uncommon case; in normal operation, threads are created, run, and complete without any kernel intervention. Each pane in Fig. 1 reflects a different point in time. Straight arrows represent scheduler activations, s-shaped arrows represent user-level threads, and the cluster of user-level threads to the right of each pane represents the ready list. At T1, the kernel allocates the application two processors; on each processor, the kernel upcalls to user-level code that removes a thread from the ready list and starts running it. At T2, user-level thread 1 blocks in the kernel. To notify the user level of this event, the kernel takes the processor that had been running thread 1 and performs a fresh upcall on a new scheduler activation; the user-level scheduler assigns the processor to another thread from the ready list and starts running it. At T3, the I/O request issued by thread 1 completes. Again the kernel must notify the user level, but the notification itself requires a processor, so the kernel preempts one of the processors running in the same address space and uses it to perform the upcall; this puts both thread 1 and the preempted thread 2 back on the ready list. At T4, the upcall takes a thread off the ready list and starts running it.
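As an illustration, a handler for the combined event at T3 might look as follows. This continues the hypothetical interface sketched above; the combined signature and the helpers thread_of, ready_list_push, and run_next_user_thread are invented names standing in for the package's ready-list and dispatch routines, not anything specified in the paper.

/* Hypothetical handler for the single combined upcall at T3: thread 1's
 * I/O completed, and thread 2's processor was preempted to deliver the
 * notification. */
struct thread;                                 /* user-level thread record */
typedef struct machine_state machine_state_t;  /* saved registers, PC, SP  */

struct thread *thread_of(int activation);      /* map activation -> thread */
void ready_list_push(struct thread *t, machine_state_t *st);
void run_next_user_thread(void);               /* dispatch from ready list */

void sa_unblocked_and_preempted(int unblocked_act, machine_state_t *unblocked_st,
                                int preempted_act, machine_state_t *preempted_st)
{
    /* T3: both affected threads go back on the ready list ... */
    ready_list_push(thread_of(unblocked_act), unblocked_st);  /* thread 1 */
    ready_list_push(thread_of(preempted_act), preempted_st);  /* thread 2 */
    /* T4: ... and the upcall's processor runs the next ready thread. */
    run_next_user_thread();
}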
V. POLICY FOR PROCESSOR ALLOCATION

The mechanism described above is independent of the policy used by the kernel for allocating processors among address spaces. Reasonable allocation policies, however, must be based on the available parallelism in each address space. In this section, we show that this information can be efficiently communicated for policies that both respect priorities and guarantee that processors do not idle if runnable threads exist. These constraints are met by most kernel thread systems; as far as we know, they are not met by any user-level thread system built on top of kernel threads. Under such a policy, no processor remains idle while some address space has a runnable thread, and processors are divided evenly among address spaces. The user level notifies the kernel under two circumstances:

a. when the address space has more runnable threads than processors, and
b. when it has more processors than runnable threads.

The kernel uses these hints to manage the allocation of processors among address spaces; a sketch of the corresponding notifications follows below. Of course, the user-level thread system must serialize its notifications to the kernel, since their ordering matters.
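The two notifications correspond to the kernel calls in Table II of the original paper [1]; as before, the C signatures are an assumption, since the paper specifies the calls and their parameters, not an API. The user-level scheduler invokes these only when a decision crosses a processor-allocation boundary, which is why ordinary thread switches never enter the kernel.

/* Sketch of the downward notifications from Table II of [1];
 * hypothetical C signatures. */

/* "Add more processors (additional # of processors needed)":
 * issued when runnable threads outnumber processors; the kernel may
 * allocate more processors and start scheduler activations on them. */
void add_more_processors(int additional_needed);

/* "This processor is idle ()":
 * issued when processors outnumber runnable threads; the kernel may
 * preempt this processor if another address space needs it. */
void this_processor_is_idle(void);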
VI. CRITICAL SECTIONS

A serious issue arises when a user-level thread is blocked or preempted while executing inside a critical section. What should happen at that instant? There are two approaches to this problem:

a. Prevention: the kernel yields its control over processor allocation (at least temporarily) to the user level. This, however, violates the semantics of address space priorities.

b. Recovery: this method is deadlock free. When the user-level thread system learns, via an upcall, that a thread has been preempted or unblocked, it checks whether that thread was executing in a critical section and, if so, temporarily continues it via a user-level context switch until the critical section is exited. This ensures that once a lock is acquired, it is always eventually released, even in the presence of processor preemption or page faults.

VII. PERFORMANCE EVALUATION

To illustrate the effect of our system on application performance, we measured the same parallel application using Topaz kernel threads, the original FastThreads built on top of Topaz threads, and modified FastThreads running on scheduler activations. In the absence of kernel involvement, thread performance is similar to that of the FastThreads package before our changes. Speedup with Topaz kernel threads initially grows but then flattens out, whereas the performance of original FastThreads and new FastThreads diverges only slightly at four or five processors. Our system explicitly allocates processors to address spaces; Topaz threads cause preemption only when no idle processor is available. A further experiment shows the effect of application-induced kernel events on performance: such events result in our system performing better than either original FastThreads or Topaz threads. Performance is worse with Topaz threads than with our system because common thread operations are more expensive. In addition, because Topaz does not do explicit processor allocation, it may end up scheduling more kernel threads from one address space than from another. Finally, application performance with modified FastThreads is good even in a multiprogrammed environment: the speedup is within 5% of that obtained when the application ran uniprogrammed on three processors. This small degradation is about what we would expect from bus contention and from the need to donate a processor periodically to run a kernel daemon thread. In contrast, multiprogrammed performance is much worse with either original FastThreads or Topaz threads, although for different reasons.

VIII. CONCLUSION

Our prototype implements threads as the concurrency abstraction supported at the user level, but scheduler activations are not tied to any particular model; they can support any user-level concurrency model, because the kernel has no knowledge of user-level data structures. Our approach derives conceptual simplicity from the fact that all interaction with the kernel is synchronous from the perspective of a single scheduler activation: a scheduler activation that blocks in the kernel is replaced with a new scheduler activation when the awaited event occurs.

REFERENCES

[1] Thomas Anderson, Brian Bershad, Edward Lazowska, and Henry Levy. Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism. ACM Transactions on Computer Systems, vol. 10, no. 1, February 1992, pp. 53-79.
[2] http://pages.cs.wisc.edu/~swift/classes/cs736-sp07/blog/2007/03/scheduler_activations_effectiv.html
[3] http://people.freebsd.org/~deischen/docs/p95-anderson.pdf
[4] http://people.freebsd.org/~deischen/docs/Scheduler.pdf
[5] http://homes.cs.washington.edu/~tom/pubs/sched_act.html
[6] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.9310
[7] http://courses.cs.washington.edu/courses/csep551/04wi/Messages/paper12/0001.html
[8] http://www.coyotos.org/pipermail/coyotos-dev/2006-January/000389.html