Synthesis (1988) ================ What is the motivation of this paper? Want general-purpose OS, but don't want performance penalty for generality Common case does not require full generality, so generate simpler code! Two key ideas: Synthetic machine, and code synthesis What is synthetic machine interface? Four abstractions: Synthetic machine Synthetic CPU Synthetic Memory Synthetic I/O Six operations: create terminate reconfigure query read write What is code synthesis? Generates code on the fly to deal with system calls In particular, create generates the code for other calls Why can synthesized code run faster than, say, Unix system calls? Factoring Invariants Many constants known at the time code is synthesized: current PID, address of kernel buffers, device file resides on, ... Collapsing Layers Can combine layers of abstraction--e.g., TCP and IP Inlining procedures avoids procedure calls, possiby context switches Also allows further opportunities for factoring invariants Executable Data structures E.g., make queues self-traversing Have startjob/stopjob routines, where stopjob branches to next job's start What about potential problems? Inflated kernel size because of code redundancy? Store rarely-used functions in common area Decision must be made by programmer of create function Code correctness with more complicated kernel structure? Try to keep Unix-like structure on some level So open is functionally like Unix just happens to generate code instead of filling in file table Protection? Use memory protection; protect synthesized code like rest of kernel Use jump table; can't jump into the middle of synthesized code Example: Compare appendices A & B Reducing synchronization (from later SOSP paper) Code isolation Separate threads access separate data For example, only a thread itself updates its own thread table entry Procedure chaining Execute sequentially instead of concurrently Example: after interrupt, change interrupted procedure's return address on stack to jump to interrupt handler Optimistic synchronization Structure so in common case no synchronization needed Example: single producer, single consumer queue Use circular buffer, one thread updates Q_head, other Q_tail Only need to wait if buffer is full or empty (Note: assumes sequential consistency from memory system) Example: multiple producer, single consumer queue Use compare and swap to increment Q_head atomically After incrementing Q_head, fill in queue elements you have claimed Use bool vector to signal which queue elements are ready Consumer waits if queue elements not ready in bool vector Scheduler (from Massalin's thesis) First, discuss hardware phase-locked loop (PLL) E.g., how to multiply the frequency of a signal by N reference signal---counter---------------| phase | +--counter--divide by N--| comp. |--DAC | | | |voltage- | | |controlled|---+--out | |oscillator| | | | +-----------------------------------------------+ Idea: use negative feedback (too fast makes output slow down) In software, can use idea to implement Frequency-locked-loop FLL - Count events per second Interval-locked-loop ILL - Measure interval between events Is this the same thing? Will converge to same rate But when conditions change, errors will be different FLL keeps # of events correct over time, but intervals vary ILL keeps intervals roughly correct, but might lose events Example: Event happens once a second, can measure with 1 usec granularity FLL will take many seconds to adjust to error ILL will adjust very quickly, by measuring actual interval Example: Event happens 30 times/sec, timer granularity 1/30 of a sec FLL independent of timer granularity, can stabilize in a few secs ILL will have lots of error Note, can also apply filters to stabilize. FLL example (want i2 to execute 4 times as often as i1 interrupt): i1 () { residue += 4; freq += Filter (residue); ... } i2 () { residue --; freq += Filter (residue); ... } Filter(int x) { static int lopass; return lopass = (7*lopass + x)/8; } In addition to lopass, can combine with Integrate and Deriv filters These principles used to implement audio processing E.g., read from cd, write to speaker Scheduled at exactly the right rate, regardless of CPU-intensive jobs Synthesize drum beats using ILL Quiz Review =========== Remind people of lectures so far: 1. Intro 2. PC hardware 3. VM System calls, Appel & Li 4. MULTICS 5. Plan 9 6. Scheduling, SMART 7. Threads, Scheduler Activations 8. Disk hardware, FFS 9. Receive livelock 10. rc and es 11. L3 12. Spin 13. Exokernel 14. Synthesis Principles encountered Have to implement a system to evaluate it (MULTICS) Implement prototype components to get a system up and running (MULTICS) Find simple, general-purpose mechanisms that support many uses (Unix, Plan9) Don't invest work in something that will be useless anyway (Livelock, SMART) Use feedback in apportioning resources (Synthesis, Livelock) Trade-off between performance & functionality (L3, Spin, Exokernel, Synthesis) Expose information to applications to help resolve trade-off (Exokernel/Scheduler Activations) Expose hardware resources to help applications (Exokernel, Appel & Li) Ask for questions on papers