Bugs as Deviant Behavior
========================

Plan for the rest of the semester:
  Last two lectures we looked at several high-security OSes
  Today and next two lectures we look at how to make OSes more correct
    Improving security but also reliability
  Then read a paper that claims OS security hopeless, do it in hardware
    You'll get my biased dissenting view on the topic
  Finally look at applying some OS ideas without completely re-writing kernel

Background for today's paper:  Metal language
  Allows you to load arbitrary code into the compiler
  Can be used to check correctness rules
    E.g., free (ent); ... return ent; -> compiler reports using ent after free
  Metal basically allows you to specify a state machine
    Go over Figure 2

Premise of paper is that programs often violate correctness rules
  Examples (from other paper):
  - Never call blocking functions in bottom half of the kernel
  - Check NULL pointers returned from routines
  - Do not allocate large variables on the kernel stack
  - Do not make inconsistent assumptions about whether pointer is NULL
  - Check array/loop bounds derived from user data
  - Release acquired locks, and do not double-acquire
  - Restore disabled interrupts
  - Do not use freed memory
  - Do not use floating point in the kernel
  - Check realloc result so as not to leak memory
  - Do not dereference untrusted user pointers
  - Allocate enough memory for type that is stored there

Previous approaches to finding such bugs
  Type systems
    + Might induce programmers to encode extra, valuable information
    - Have to re-write code
    - Can't encode some types of restriction (e.g., temporal)
  Specifications & formal verification
    + A formal proof gives you very high assurance
    - But it's very hard--can't prove everything
    - How do you know the code conforms to the spec?
  Dynamic invariant inference
    [Remember the Eraser lockset algorithm I mentioned in Threads lecture?]
    + Has proven effective at finding bus (in the case of eraser)
    - Have to actually execute the code
        How to test drivers w/o device?
	Even if you can hit code path, doesn't mean you will while testing

In previous work by these authors, found many violations of correctness rules
Problem:  How to find such rules if you don't understand the system?
  Real programs have 100s of unspecified rules
  Idea:  Try to infer the rules from the source code itself

Why should this work?  Digression... Law enforcement:
  What if a detective interrogates witnesses?
    Doesn't know what actually happened
    But if two people contradict each other, one must be lying
  Also if 100 people say X and 1 person doesn't, probably the 1 is incorrect
Same with code (p. 3 of paper):
  struct mxser_struct *info = tty->driver_data; // MUST: tty not NULL
  //...
  if (!tty || ...) ...  // Here tty may or may not be NULL

General approach for infering bugs:
  User supplies some TEPLATES
    Cannot use a pointer after passed to function <f>
    <a> must be paired with <b>
  <a>, <b>, and <f> are SLOTS
  Actual code that could fit in <a> is called a SLOT INSTANCE

Now how to find instances for the slots?
  One technique:  Statistical analysis
    Find lots of repeated idioms:
    lock(a); x++; unlock(a); ... lock(a); x++; unlock(a); ... 
      Then an unprotected x++ means MUST: x not protected by l
    If x almost always protected by lock(a), unprotected access probably a bug
       (MAY belief conflicts with MUST belief)
  Another technique:  Internal consistency
    Check for self-contradiction
    if (card==NULL)
      printk (KERN_ERR, "capidrv-%d: ... %d!\n", card->contrnr, id)
  In general can infer beliefs:
    x = *p;      // MUST:  programmer believes p is not NULL
                 //  MAY:  x may be protected by l
    free (p);    // MUST:  p is heap allocated
                 // MUST:  p not used any more
    unlock (l);  // MUST:  l acquired

In more detail, how to specify an internal consistency checker?  Need:
  1. A rule template T
      Ex: do not dereference null pointer <p>
  2. Valid slot instances for T
      Ex: any pointer is potentially a valid instance
  3. Code actions that imply beliefs
      Ex: dereference or compare pointers
  4. Belief combining and contradiction rules
      Ex: {null} and {notnull} => contradiction
  5. Rules for belief propagation
      Ex: testing implies that above you could be in either state
      Ex: when two paths after test join => {null, notnull}

Given checker, must:
  1. Find MUST beliefs
       Direct observation:  p = NULL; or if (p == NULL) { ... }
       Infer based on pre/post-conditions:
         free (p) -> means p was dynamically allocated
	 p = q -> means p and q might have been different
  2. Relate code
       By code path:  E.g., If a calls b, should have same fault model
       By abstrction:  If a and b conform to same interface
         a and b might be same functions in different versions of software
         a and b might be assigned to the same function pointer
         ith argument of and b should then have similar properties

How to specify a statistical checker?  Need:
  0. Optional pre-processing phase to reduce potential instances
  1-5. Same as internal consistency checker
    Ex: Template lock <l> protects variable <v>
  6. Additionally need a "rank" function
       Count how often a lot instance was checked, and how often failed
       Rank is function of these two counts
  Potential problems?
    Macros that check pointer for NULL -- why is this a problem?
    Functions that don't return
    A lot of code between redundant checks makes them OK.  Why?

Using statistical checker:
  Good function z(n,e) = (e/n - p0) / sqrt (p0(1-p0)/n)
    n = number of checks, e = number of times check passes
    p0 = 0.9; basically if failures were random with prob 10%,
       how many standard errors away is (n,e) from expected
  Can augment with trustworthiness of code in which examples found
  Also note empty template slot might signal error
    E.g., if lock doesn't appear to be protecting any data
  Should system just drop errors with rank under some threshold t?
    Problem:  Would be very sensitive to value of t
  Instead, just rank them
    Programmer goes through in order of decreasing rank
    Stop when fraction of false positives is too high

What are latent specifications (sec 5.2)?
  Based on name of function (lock/unlock/alloc/free/release/spl/intr/...)
  Common idioms -> NULL pointer return to designate error
  "Executable specifications" -> assert, BUG, panic
  Common knowledge -> don't dereference NULL pointers

Example:  Can't dereference user-supplied pointers in the kernel
  Can only access them through special paranoid routines
  Self-contradiction in line of linux 2.4.4 kernel (p. 11):
      if (copy_to_user (rt, ipddp_find_route(rt), ...))
    First use of rt "taints" the variable -- MUST be user-supplied
    Second use of rt (in find_route) is uncheck -- MUST be trusted
  2 passes - global pass, computes taint/dereference/assignment
	     transitive closures
           - local checking pass
  Why not just check arguments of functions?  Why use second pass?
    Many drivers use arguments as integers or pointers

Example:  use after free -- don't want to use a pointer after freeing it
  Problem:  There are many "free-like" functions in the kernel--how to find?
  To find slot instances, find functions that take last use of pointer
    -- they MAY be free-like functions
  Then use statistical analysis

Another problem:  Finding too many candidate slot instances
    E.g., what pairs of functions to check for lock/unlock-like behavior
  LATENT SPECIFICATIONS can solve the problem (sec 5.2)
  E.g., Use substrings in function names:  lock, alloc, release, assert, ...

Ranking output - What to do if you aren't sure it's a bug
  Rank the possible bugs by likelihood that they are
  Then investigate in order, stop when too many false positives