Nooks
=====

What is the problem this paper is addressing?

Why drivers?  Does this seem like a viable approach?
  How does it compare to virtual machines?
    challenge is virtualizing kernel/driver interface, not hardware
  What alternative approaches might one take?
    Code checking tools -- static analysis, or run-time tools (like eraser)
    Safe languages
    Software fault isolation (SFI), kind of like safe languages
  How might you combine Nooks with previous approaches?
    E.g., SFI saves you from context switch & TLB misses
          but still have to track object usage

Nooks is trying to find a new design point between unprotected and safe:
    - fault resistance, not fault isolation
    - design for mistakes, not abuse
  What are the benefits and drawbacks of this approach?
    + works with today's code
    + performance impact possibly less than, say, full microkernel
    - not 100% effective
    - only applies to extensions, not core kernel (unlike automated checking)

What are the three goals of the system?
  Isolation - don't let fault in one extension infect rest of system
  Recovery - support automatic recovery after a failure
  Backwards compatibility - e.g., work with Linux

How does vanilla Linux deal with bugs/assertion failures in the kernel?
  Make a distinction between running in process vs. interrupt context
  Process context - kill process
    Note this isn't quite "fair", because bug is kernel bug, not user bug
    But allows some degree of recoverability
  Interrupt context - crash & reboot machine.  Why?

How does Nooks achieve Isolation?
  Use paging hardware to protect kernel & extensions against bad extensions
    See Fig 3:  Kernel can write everything, extensions only write themselves
  Use Extension Procedure Call (XPC)
  What do you have to do to call into an extension?  (Fig. 4)
    Copy any argument data structures to where extension can write them
    Might need to follow/adjust any pointers in data structures
    Adjust stack pointer
    Load %cr3 with address space of extension
    === run extension
    Switch %cr3 and stack back
    Copy results back; synchronize any modified structures
  What about modifications to non-argument kernel data structures?
    Fortunately, this is often done through macros and inline functions
    Can change these into XPCs
  Where do page tables come from when loading %cr3?
    Nooks has to maintain a set of "shadow" page tables
      Just change code where Linux touches page tables
      Have to modify page fault handler... how?
    Current task (Linux equiv of proc) structure on kernel stack?
    Could you optimize this process on the x86?
      If extensions are in different 4 MB regions... maybe re-use page tables
      (Just clear PTE_W in page directory entry)
      Or at least do this for some regions (might not work for buffer cache)
      Also, maybe targeted TLB flush instead of %cr3 load?
  What is deferred XPC mechanism?  Where/why does this come in?
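
The Fig. 4 call sequence can be modeled in a few lines.  This is a toy
sketch, not Nooks code: `Memory`, `ProtectionFault`, and `xpc_call` are
made-up names, a `current` field stands in for %cr3, and the stack switch
is not modeled at all.  It only shows the copy-in / switch / run / switch
back / copy-out shape, and that a stray write to kernel data faults
instead of landing.

```python
import copy


class ProtectionFault(Exception):
    """Raised when a domain writes memory it does not own."""


class Memory:
    """Toy memory: every object has an owner; `current` plays the role of %cr3."""

    def __init__(self):
        self.owner = {}          # object name -> owning domain
        self.data = {}           # object name -> value
        self.current = "kernel"  # domain whose "page tables" are loaded

    def alloc(self, name, owner, value):
        self.owner[name] = owner
        self.data[name] = value

    def write(self, name, value):
        # Kernel can write everything; an extension only its own objects.
        if self.current != "kernel" and self.owner[name] != self.current:
            raise ProtectionFault(f"{self.current} wrote {name}")
        self.data[name] = value


def xpc_call(mem, ext, func, arg_name):
    """Call `func` inside domain `ext` with a private copy of one argument."""
    copy_name = f"{ext}:{arg_name}"
    mem.alloc(copy_name, ext, copy.deepcopy(mem.data[arg_name]))  # copy in
    mem.current = ext                                             # "load %cr3"
    try:
        func(mem, copy_name)                                      # run extension
    finally:
        mem.current = "kernel"                                    # switch back
    mem.write(arg_name, copy.deepcopy(mem.data[copy_name]))       # copy out
```

If `func` raises, the copy-out line never runs, so the kernel's version of
the argument is left as it was before the call.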

What are wrappers?  How do they work?
  Three purposes:
    Check parameters for validity
    Implement call-by-value-result  (What's this vs. call by reference?)
    Perform XPC
  Basically works through linker
  Who writes a wrapper?
    Tool auto-generates skeleton from header
    Fill in by hand
  Need to know properties
  How specific to each extension is the wrapper?  See Fig. 5
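
One way to see call-by-value-result vs. call by reference is to look at
what the caller's object contains after the callee fails partway through.
A minimal sketch, assuming objects are plain dicts; both function names
and the `validate` hook are hypothetical, not the wrapper interface from
the paper.

```python
import copy


def call_by_reference(func, obj):
    """Callee mutates the caller's object directly."""
    func(obj)


def call_by_value_result(func, obj, validate=lambda o: True):
    """Callee mutates a private copy; the caller's object is updated only
    after the callee returns normally and the copy passes `validate`."""
    private = copy.deepcopy(obj)   # copy in
    func(private)                  # callee only ever sees the copy
    if not validate(private):
        raise ValueError("extension returned an invalid object")
    obj.clear()                    # copy out
    obj.update(private)
```

With a callee that sets a bad field and then raises, the by-reference
caller is left holding the bad field, while the value-result caller still
has its original object.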

What is Object Tracking and why?
  Records address/type of all objects in use by an extension
    If used for call, just attach to stack
    If held, keep in per-extension hash table
  If ext. might write object, keep association between kern & ext. versions
  How do you know lifetime of objects?
    By hand inspection - determine type of object
      passed in for call, allocated/deallocated by ext., special (timer), ...
  Do you always copy objects?
    No... more efficient just to re-map network & disk buffers
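
A per-extension tracking table might look roughly like this.  It is a
sketch under assumptions: `ObjectTracker` and its methods are invented
names, addresses are plain integers, and an `ext_addr` of None stands for
an object that is re-mapped rather than copied.

```python
class ObjectTracker:
    """Per-extension table of kernel objects the extension currently holds."""

    def __init__(self):
        # kernel address -> (type name, extension-side address or None)
        self.held = {}

    def add(self, kern_addr, type_name, ext_addr=None):
        self.held[kern_addr] = (type_name, ext_addr)

    def remove(self, kern_addr):
        self.held.pop(kern_addr, None)

    def to_extension(self, kern_addr):
        """Translate a kernel pointer to the extension's copy, if there is one."""
        _, ext_addr = self.held[kern_addr]
        return kern_addr if ext_addr is None else ext_addr

    def release_all(self, free_by_type):
        """On recovery, free everything still held, using a per-type routine."""
        for kern_addr, (type_name, _) in list(self.held.items()):
            free_by_type[type_name](kern_addr)
            del self.held[kern_addr]
```

The type recorded with each object is what lets `release_all` pick the
right cleanup routine when the extension is torn down.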

How do you detect a fault?
  Easy cases... page fault or other exception in extension
  What about harder cases... e.g., no network packets received
    User can detect and initiate recovery

How do you recover from a fault?
  - Disable any interrupts vectored to the extension, if driver
     (what if you didn't do this... could get livelock or worse)
  - Invoke user-mode recovery agent
     Perform extension-specific recovery, notify sysadmin,
     Change configuration, disable after repeated failures, ...
     By default, unloads and re-loads module
  What's this about interruptible vs. non-interruptible state?
  What about allocated memory?  (This is why we need object tracking)
  What about things like network buffers w. pending DMA?
    Only free buffers after re-loading driver
    after it has re-initialized the device
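
The ordering constraints above can be written down as a small sketch.
Everything here is illustrative: `recover`, the dict layout of `ext`, and
the `max_failures` threshold are assumptions, and the steps are recorded
as strings rather than performed.  The one point it is meant to show is
that DMA buffers are freed only after the device has been re-initialized.

```python
def recover(ext, max_failures=3):
    """Return the ordered list of recovery steps for a failed extension.

    `ext` is a dict with keys: failures (int), tracked (list of object
    names found via object tracking), dma_buffers (list of buffer names).
    """
    log = []
    ext["failures"] += 1
    log.append("disable interrupts")       # first, so the device cannot livelock us
    log.append("unload module")
    for obj in ext["tracked"]:             # ordinary memory can go right away
        log.append(f"free {obj}")
    ext["tracked"] = []
    if ext["failures"] >= max_failures:
        log.append("leave disabled")       # give up; DMA buffers stay allocated
        return log
    log.append("reload module")
    log.append("reinitialize device")      # device no longer has DMA pending
    for buf in ext["dma_buffers"]:         # only now is it safe to reuse these
        log.append(f"free {buf}")
    ext["dma_buffers"] = []
    return log
```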

How could an extension bypass Nooks to corrupt system?
  Set %esp to something bad and take an exception (what happens on x86?)
  DMA to physical memory you can't write
  move something to %cr3 (after all, XPC mechanism does this)
  disable interrupts and loop forever
  logic bugs that don't involve trashing memory

How do you evaluate something like this?
  Care if it achieved backwards compatibility
  Care about whether it improves Reliability, and cost in Performance

How backwards-compatible is Nooks?
  One time costs
    Basic kernel changes (e.g., to update shadow page tables)
    Need to implement base Nooks functionality  (object tracking, XPC, etc.)
    Need to write wrappers for various types of extensions
    See Table 2 for an idea; non-wrapper code is only about 8,000 lines
  Per-extension costs
    Need some driver-specific wrappers
    Need to re-compile extensions
    Sometimes need to modify extensions--when?
      If directly modifies kernel data structure w/o using function/macro
      kHTTPd was only one of their extensions that required this (in 13 places)

How to measure reliability?
  Fault injection.  What did they do (see journal paper)?
    Automatically changes single instructions
      Emulate programming errors:
        - source & destination faults emulate assignment errors
        - pointer faults emulate bad pointer calculations to corrupt memory
        - interface faults emulate bad parameters
        - branch faults remove branch conditions
        - loop faults change termination condition of loops
      Other random changes
        - text fault: flip a random bit in some instruction
        - NOP fault: delete a random instruction (change it to nop)
  Is this realistic?
  How do results look?
  Too optimistic or pessimistic?
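
The two "random change" faults are easy to sketch.  This is not the
authors' injector: it assumes the driver text has already been split into
instructions (a list of byte strings), and `text_fault` / `nop_fault` are
made-up names.  The only hardware fact used is that 0x90 is the one-byte
x86 NOP.

```python
import random

NOP = b"\x90"   # single-byte x86 NOP opcode


def text_fault(instrs, rng):
    """Flip one random bit in one randomly chosen instruction."""
    out = list(instrs)
    i = rng.randrange(len(out))
    raw = bytearray(out[i])
    bit = rng.randrange(len(raw) * 8)
    raw[bit // 8] ^= 1 << (bit % 8)
    out[i] = bytes(raw)
    return out


def nop_fault(instrs, rng):
    """Replace one randomly chosen instruction with NOPs of the same length."""
    out = list(instrs)
    i = rng.randrange(len(out))
    out[i] = NOP * len(out[i])
    return out


# push %ebp; mov %esp,%ebp; pop %ebp; ret
code = [b"\x55", b"\x89\xe5", b"\x5d", b"\xc3"]
mutated = nop_fault(text_fault(code, random.Random(0)), random.Random(1))
```

Padding with same-length NOPs keeps every later instruction at its
original offset, so jump targets elsewhere in the code still line up.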

Performance... Let's look at table 4:
  Play-mp3 looks good
  Why does send-stream have more XPCs than receive stream? (batching)
    Why does this not matter for performance? (cost overlaps w. xmit)
  Why does compile take bigger hit than send-stream (which has more XPCs)?
    Compile is CPU-bound
    How did they produce graph in Figure 8?
      What is statistical profiling?  What does this tell us?
      Why don't they show user-mode execution time?
    Where is CPU time going?
      Extra code -- e.g., XPC, object tracking
      Existing code running more slowly?
        Why?  TLB misses; What are "Pentium 4 performance counters"?
  Why does kHTTPd do so much worse under Nooks?  (60% worse, ouch)
    CPU problem, like compile
    Also, transactional, not buffered...  how does this affect things?
    Do we care?  kHTTPd does sound like a bogus project
      Maybe use special OS if you care so much (more next lecture...exokernel)
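
On the statistical profiling question: the idea is to interrupt at a
fixed period, record where the CPU is, and treat the sample counts as an
estimate of where time goes.  A toy sketch, with a synthetic timeline in
place of a real program counter; `sample_timeline` and the labels in the
usage note are invented, and none of the numbers come from the paper.

```python
from collections import Counter


def sample_timeline(timeline, period):
    """Sample a timeline every `period` ticks and count hits per label.

    `timeline` is a list of (label, duration_in_ticks) run back to back.
    """
    samples = Counter()
    next_sample = 0
    start = 0
    for label, duration in timeline:
        end = start + duration
        while next_sample < end:       # every sample tick inside this segment
            samples[label] += 1
            next_sample += period
        start = end
    return samples
```

For example, `sample_timeline([("xpc", 30), ("driver", 50), ("tracking", 20)], 10)`
gives 3, 5, and 2 samples, matching the 30/50/20 split.  Segments much
shorter than the period can be missed entirely, which is why the result
is only a statistical estimate.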

Would same ideas apply to other OSes?
  Authors claim Linux is the worst-case scenario.  Why?  Do we believe this?
    In terms of lots of ill-defined extension interfaces, probably true
    That Linux doesn't reboot on process-context panic might help, though