Quiz correction: question 6D could be true
Vote on extra lecture: Synthesis vs. Singularity
Events, handlers, and guards in SPIN (didn't get to last time)

Exokernel
=========

Many papers published about new OS ideas
  Fancy schedulers
  Better thread systems
  Better I/O prefetching and caching
  Useful VM primitives & implementations
But many of these ideas don't have impact.  Why?
  Hard to modify OSes
  Many OS abstractions come with trade-offs:
    E.g., speed vs. generality
    No single choice is optimal for all applications
  Examples (should be familiar by now)?
    Databases & garbage collectors interact badly with LRU paging
Hence, the third in a series of papers on extensible OSes:
  L3 - make microkernels viable with fast IPC
  SPIN - put extensions in the kernel, avoiding IPC to servers
  Exokernel - put extensions in the *application/library*, avoiding servers

What is the Exokernel architecture?
  Basic idea: separate protection from management of resources
  Why?  (...or, as stated in the previous paper: "Exterminate all OS abstractions")
    Applications may know better how to manage resources
    In fact, applications may benefit from knowing what resources they have
    Paging, buffering often *hide* information from apps
  What this means in practice:
    Expose allocation
    Expose names
    Expose revocation

What is the end-to-end argument?  How does it apply to the Exokernel?

Main approach is based on three techniques (p. 1).  What are they?
  Secure bindings - decouple authorization from use of a resource
    Hardware mechanisms
    Software caching
    Downloading application code
  Visible resource revocation--what is this and why?
    Ask the application to choose which resources to give up
    E.g., when looping through a file, want to evict MRU, not LRU
  Abort protocol--what's this?
    For when the friendlier revocation upcall takes too long
    Load a "repossession vector" with resources for drastic situations
      (e.g., disk blocks where pages can be written)

What does the environment abstraction consist of in Aegis (p. 8)?
  Exception context (think sys_env_set_pgfault_upcall in JOS; see the sketch below)
    For each exception (e.g., page fault), entry point & where to save regs
  Interrupt context
    Same, but for interrupts (e.g., when the time quantum has expired)
    Note: also has an "interrupt enable" flag, to disable interrupts
    In JOS, exception state is saved on the stack, so faults can recurse
    Aegis stored state at fixed addresses in the lower 64K
      So potentially nowhere to store state in a recursive interrupt
  Protected entry context
    Entry point for IPC from other processes
  Addressing context
    Small number of guaranteed page mappings
      (so your user-level TLB fault handler is always present)
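Aside: a minimal sketch of what setting up an exception context looks like from
the library's side, modeled on JOS (sys_env_set_pgfault_upcall, sys_page_alloc,
and UXSTACKTOP are real JOS names; the trapframe layout and asm stub are
simplified/hypothetical here).  Aegis's per-exception entry points follow the
same pattern: register an entry point and a place to save state.

#include <stdint.h>

struct utrapframe {             /* simplified stand-in for JOS's UTrapframe: */
    uint32_t fault_va;          /* state the kernel pushes on the user       */
    uint32_t err, regs[8];      /* exception stack before jumping to the     */
    uint32_t eip, eflags, esp;  /* registered upcall entry point             */
};

static void (*pgfault_handler)(struct utrapframe *);

extern void _pgfault_upcall(void);   /* asm stub: calls pgfault_handler, then
                                        resumes the faulting instruction */
int sys_env_set_pgfault_upcall(int envid, void *upcall);   /* JOS syscalls; */
int sys_page_alloc(int envid, void *va, int perm);         /* envid 0 = self */

#define UXSTACKTOP 0xeec00000   /* JOS's user exception stack location */
#define PGSIZE     4096
#define PTE_P 0x1
#define PTE_W 0x2
#define PTE_U 0x4

void
set_pgfault_handler(void (*handler)(struct utrapframe *))
{
    if (pgfault_handler == 0) {
        /* First registration: allocate the exception stack and tell the
         * kernel where the upcall entry point lives. */
        sys_page_alloc(0, (void *)(UXSTACKTOP - PGSIZE), PTE_P | PTE_U | PTE_W);
        sys_env_set_pgfault_upcall(0, _pgfault_upcall);
    }
    pgfault_handler = handler;   /* _pgfault_upcall dispatches through this */
}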
How is physical memory multiplexed in Aegis?
  Go over what MIPS VM looks like:
    Hardware has a 64-entry TLB
    References to addresses not in the TLB trap to the kernel
    Each TLB entry has the following fields:
      Virtual page, Pid, Page frame, NC, D, V, Global
  Kernel itself is unpaged
    All of physical memory contiguously mapped in high VM
    Kernel uses these pseudo-physical addresses
  User TLB fault handler very efficient
    Two hardware registers reserved for it
    utlb miss handler can itself fault--allows paged page tables

How does Aegis's VM interface work on page faults (see p. 9)?
  Application VM divided into two segments (why?)
    Segment 1 normal, Segment 2 may contain guaranteed mappings
    Why?  To take the fast path for the common case of segment 1

How is the performance of VM?  Look at Table 10
  Why are prot100 and unprot100 slower on Aegis?
    Two data structures may have to be updated, the page table & the STLB
    Maybe also "immaturity of the implementation"

How is the processor multiplexed?
  Round robin; allocate slots
How did the stride scheduler implementation work? (sec 7.3, p. 11)
  (the core algorithm is sketched below)
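A sketch of the core of stride scheduling (Waldspurger & Weihl), the algorithm
behind the sec. 7.3 application-level scheduler.  Each client gets CPU time in
proportion to its tickets: run the client with the minimum "pass", then advance
its pass by its stride.  Names and the array representation are illustrative,
not the paper's code.

#include <stdint.h>
#include <stddef.h>

#define STRIDE1 (1 << 20)       /* fixed-point scaling constant */

struct client {
    uint64_t pass;              /* virtual time of the client's next quantum */
    uint32_t stride;            /* STRIDE1 / tickets */
};

void
client_init(struct client *c, uint32_t tickets)
{
    c->stride = STRIDE1 / tickets;   /* more tickets => smaller stride */
    c->pass = c->stride;             /* start one stride in the future */
}

/* Choose whom to run for the next quantum and charge them for it. */
struct client *
schedule(struct client *clients, size_t n)
{
    struct client *best = &clients[0];
    for (size_t i = 1; i < n; i++)
        if (clients[i].pass < best->pass)
            best = &clients[i];
    best->pass += best->stride;
    return best;
}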
How is the network multiplexed?
  DPF - dynamic packet filter
  Is the regular scheduler good enough to process received packets?
    No--latency is too high.  Why does this matter for protocols like TCP?
  ASHes.  What is the motivation for ASHes?
    Direct, dynamic message vectoring
      No need to copy to intermediary kernel buffers
    Dynamic integrated layer processing (ILP)--e.g., checksum while copying
      Used a pseudo-assembly language to specify this
      In setting up an ASH you would specify regular code + some ILP code
    Message initiation--e.g., send a TCP ack immediately
    Control initiation--e.g., create/activate thread, acquire lock, etc.
  Does this work?  Look at Figure 2
    Good.  Other benefits besides latency: don't have to pre-specify buffers, etc.

Protected control transfer (sec 5.5, p. 9)
  Synchronous vs. asynchronous (rescheduled next time quantum)
  Note: no access control--how would you implement it in a library?
    Not necessarily obvious without some notion of identity...
  How is performance?  Table 6 looks good.  Is this meaningful?
    Scaling L3 is a little bogus
      IPC stresses aspects of CPUs that don't improve with MIPS

How does IPC work?
  Built on the protected entry context of the environment
  What's going on in Table 8?
    pipe - passes a word through a circular shared-memory buffer
      (see the pipe sketch at the end of these notes)
    pipe' - same as above, but with inline calls to read and write (only on Aegis)
    shm - relies on directed yield to bump a counter in shared memory
    lrpc - does an RPC to increment a counter

What is the plan for revoking resources?
  Expose information so that the application can do the right thing.
  Ask applications politely to release resources of a given type.
  Ask applications with force to release resources.

What are the examples of extensibility in this paper?
  RPC system in which the server saves and restores registers (Table 12)
  A different page table, and the stride scheduler
  How would you do a buffer cache?
  How would you do sleep/wakeup on various events?
  How would you do a file system?

Some cool exokernel hacks from later on:
  Fast, simple binary emulation of other OSes
    Emulator runs in the same address space as the process
    System call ints are vectored back to user space
    Therefore, emulation can actually be faster than the original OS (e.g., getpid)
    In general, the emulator can avoid expensive checks, because it trusts the app
  XCP - highly optimized file copy
    All writes can be delayed until you attach the file tree at the end of the copy
  Cheetah - highly optimized web server
    Special file system keeps TCP checksums in with the data
    File system co-locates files (like images) based on HTML grouping
    Combines the disk buffer cache with the TCP retransmission cache
      Never need to stream data through the processor cache (true 0-copy)
      Never need more than one copy of the data in memory

Lessons learned in retrospect several years later:
  Exposing kernel data structures is a big win (e.g., for wake predicates)
  Exokernel interface design is hard
    Even before the exokernel, things like scheduler activations were not obvious
    DPF, buffer cache, XN, wake predicates: all non-trivial
  Information loss can put libOSes at a disadvantage
    E.g., UNIX can implement LRU paging across applications
    Solution: the exokernel can keep statistics, but leave interpretation to apps
      Provide space for application data in kernel structures
  Fast applications don't require good microbenchmark numbers
  Cheap critical sections are useful
  User-level page tables were very hard
    E.g., when an ASH accesses VM, it might need an app-level fault handler
    Even with kernel page tables, self-paging is complicated
  ASHes might not have been necessary
    Yes, upcalls are expensive, but maybe not that expensive
  Downloaded code is powerful
    But not so much for performance reasons, like fewer upcalls
    Rather, because you can control and reason about the execution
      Check packet filters for conflicts, merge packet filters
        (a toy conflict check is sketched at the end of these notes)
      XN (the file system) needs to know downloaded code is deterministic
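As promised above, a sketch of what the Table 8 "pipe" case might look like: a
single-producer/single-consumer circular buffer of words mapped into both
environments, with a hypothetical yield_to() standing in for Aegis's directed
yield.  This is a reconstruction under assumptions, not the paper's code.

#include <stdint.h>

#define SLOTS 64

struct pipe {
    volatile uint32_t head;     /* free-running count: next slot writer fills  */
    volatile uint32_t tail;     /* free-running count: next slot reader drains */
    uint32_t buf[SLOTS];
};

extern void yield_to(int env);  /* hypothetical: donate the rest of this
                                   quantum to a specific environment */

void
pipe_write(struct pipe *p, int reader_env, uint32_t word)
{
    while (p->head - p->tail == SLOTS)    /* full: let the reader drain */
        yield_to(reader_env);
    p->buf[p->head % SLOTS] = word;
    /* A real implementation needs a compiler/memory barrier here so the
     * data store is visible before the head update publishes it. */
    p->head++;
}

uint32_t
pipe_read(struct pipe *p, int writer_env)
{
    while (p->head == p->tail)            /* empty: let the writer run */
        yield_to(writer_env);
    uint32_t w = p->buf[p->tail % SLOTS];
    p->tail++;                            /* frees the slot for the writer */
    return w;
}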
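And a toy version of why downloaded code can be reasoned about: if a packet
filter is just a list of (offset, mask, value) tests, in the spirit of DPF, the
kernel can conservatively check two filters for conflicts before installing
them.  The representation and check below are illustrative, not DPF's actual
algorithm.

#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* One test: pkt[off] & mask == value (assume value == (value & mask)). */
struct atom { uint16_t off; uint8_t mask, value; };

struct filter { const struct atom *atoms; size_t n; };

bool
filter_matches(const struct filter *f, const uint8_t *pkt, size_t len)
{
    for (size_t i = 0; i < f->n; i++) {
        const struct atom *a = &f->atoms[i];
        if (a->off >= len || (pkt[a->off] & a->mask) != a->value)
            return false;
    }
    return true;
}

/* Conservative conflict check: f and g might both accept some packet
 * unless they place contradictory demands on a shared masked byte. */
bool
filters_conflict(const struct filter *f, const struct filter *g)
{
    for (size_t i = 0; i < f->n; i++)
        for (size_t j = 0; j < g->n; j++) {
            const struct atom *a = &f->atoms[i], *b = &g->atoms[j];
            uint8_t shared = a->mask & b->mask;
            if (a->off == b->off && (a->value & shared) != (b->value & shared))
                return false;   /* provably disjoint: no packet passes both */
        }
    return true;                /* no contradiction found; may overlap */
}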