Scheduler Activations
=====================

* What is the goal of this work?
  Functionality of kernel threads with performance and flexibility of
  user-level threads.

* What's wrong with user-level threads?
  - A blocking system call blocks all threads.
    What about select?  Doesn't work for all syscalls.
  - A page fault blocks all threads
  - Hard to run as many threads as CPUs.  Why?
    - Don't know how many CPUs
    - Don't know when a thread blocks
  - Deadlock? (p. 59 bottom)

* What about kernel threads?
  - Handle blocking syscalls/page faults well
  - Adds many user/kernel crossings--expensive
    Thread switch, create, exit, lock, signal, wait, ...
    On Pentium III: getpid: 365 cycles, fn call: 7 cycles
  - Typically 10x-30x slower than user threads

* User-level threads multiplexed on kernel threads?
  - Different apps have different needs (thread priorities, etc.)
  - Kernel doesn't know best thread to run
  - Kernel doesn't know about user-level locks
    - priority inversion (preempt while in critical section)
    - too much info changing too quickly to notify kernel
  - Hard to keep same number of kthreads as CPUs
    - Neither kernel nor user knows how many runnable threads
    - User doesn't even know number of CPUs available
  - Can even have deadlock!
    Example: One uthread is an NFS loopback server
             Another uthread causes a dirty buffer to be flushed
             Second uthread blocks kthread--if only 1 kthread, bad!

* How do scheduler activations address the problem?
  - Let user program schedule threads
    (most thread ops just a function call)
  - Run same number of threads as you have CPUs
    (know exactly which threads you can run and which are in
    blocking syscalls or page faults)
  - Minimize number of user/kernel crossings

* What is a scheduler activation?
  - Virtual CPUs
    - Always begin execution in user scheduler
    - User scheduler keeps activation to run a thread
    - Preempted by kernel, but never directly resumed
  - How many scheduler activations does a process need?
    - One for each CPU
    - One for each blocked thread.  Why?
      Kernel might need its stack when blocking op completes
  - When must kernel call into user-space?
    - New processor available
    - Processor has been preempted
    - Thread has blocked
    - Thread has unblocked
  - When must user call into kernel?
    - Need more CPUs
    - CPU is idle
    - Preempt thread on another CPU (for higher priority thread)
    - Return unused scheduler activation for recycling
      (after user thread system has extracted necessary state)
  - How does this compare to # of u/k crossings with user-level and
    kernel-level threads packages?

* What happens during preemption (in detail)?

* How does kernel notify process when taking away last CPU?
  - Delays until process rescheduled
  - Why not just resume the last preempted scheduler activation?
    - Might be on different CPU, messing up cache affinity
    - Application might want to service high-priority timeouts
    - Point is: Give user-level thread system all the information

* What if a preempted scheduler activation is in a critical section?
  - Does this matter?
    - Might kill performance if holding spinlock
    - Could hold lock on ready list -> deadlock
  - How to deal with this?
    - Detect thread in critical section
    - Finish critical section (function copy returns to scheduler)
  - What if critical thread blocked in page fault?
    - Performance might be suboptimal, but at least correct
  - What if scheduler activation entry point causes page fault?
    - Create infinite # of scheduler activations?
    - Kernel checks for this special case and resumes activation

* What abstractions besides threads might you build on scheduler
  activations?

* Could libasync use scheduler activations?
  - Can't just preempt threads and have concurrency - data races
  - Could continue blocked thread in different scheduler activation
    Fake EAGAIN return from syscall?
    Register event ready when thread gets resumed
  - Basically a way to construct non-blocking API from blocking

* Evaluation of paper
  - Statement of claims?
  - Evaluation of functionality?
  - Performance evaluation?
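
* Toy model of the upcall interface (illustrative sketch, not from the paper)
  The "when must kernel call into user-space" events and the "one
  activation per CPU plus one per blocked thread" count can be made
  concrete with a small simulation.  Everything here is an assumption
  made for illustration: the class name UserScheduler, the method names
  (add_processor, activation_blocked, activation_unblocked), and the
  integer activation ids are hypothetical, and the "kernel" is just the
  demo() driver.  The sketch also assumes that, to deliver an unblock
  notice, the kernel preempts one of the process's own running
  activations and makes the upcall in a fresh one.

```python
from collections import deque


class UserScheduler:
    """Toy user-level scheduler driven by kernel upcalls (hypothetical names)."""

    def __init__(self, threads):
        self.ready = deque(threads)  # runnable user-level threads
        self.running = {}            # activation id -> thread running on it
        self.blocked = {}            # activation id -> thread blocked in kernel
        self.idle = []               # activations that found nothing to run
        self.recycled = []           # activations handed back to the kernel

    def activations_in_use(self):
        # One per CPU currently running a thread, plus one per blocked thread.
        return len(self.running) + len(self.blocked)

    def _dispatch(self, act):
        # Every upcall starts in the user scheduler, which picks a thread.
        if self.ready:
            self.running[act] = self.ready.popleft()
        else:
            self.idle.append(act)  # a real system would tell the kernel "CPU is idle"

    # ---- upcalls (kernel -> user) ----

    def add_processor(self, act):
        """New processor available: run something on it."""
        self._dispatch(act)

    def activation_blocked(self, blocked_act, new_act):
        """A thread blocked in the kernel; its old activation stays tied up."""
        self.blocked[blocked_act] = self.running.pop(blocked_act)
        self._dispatch(new_act)

    def activation_unblocked(self, unblocked_act, preempted_act, new_act):
        """A blocked thread can continue (assumed: delivered by preempting
        preempted_act).  Both threads go back on the ready list and the two
        old activations are returned for recycling."""
        self.ready.append(self.blocked.pop(unblocked_act))
        self.ready.append(self.running.pop(preempted_act))
        self.recycled += [unblocked_act, preempted_act]
        self._dispatch(new_act)


def demo():
    """Two CPUs, three threads; thread A blocks and later unblocks."""
    s = UserScheduler(["A", "B", "C"])
    s.add_processor(0)               # activation 0 runs A
    s.add_processor(1)               # activation 1 runs B
    s.activation_blocked(0, 2)       # A blocks; activation 2 runs C
    peak = s.activations_in_use()    # 2 CPUs + 1 blocked thread
    s.activation_unblocked(0, 1, 3)  # A ready again; B preempted for the notice
    return s, peak


if __name__ == "__main__":
    sched, peak = demo()
    print("peak activations:", peak)
    print("running:", sched.running, "ready:", list(sched.ready))
```

  Note that the kernel never resumes a thread itself: after the unblock
  upcall the user scheduler chooses between A and B.  FIFO order is an
  arbitrary choice here; a real thread system could apply its own
  priorities at this point.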