Bugs as Deviant Behavior ======================== Background: Metal language Allows you to load arbitrary code into the compiler Can be used to check correctness rules E.g., free (ent); ... return ent; -> compiler reports using ent after free Premise of paper is that programs often violate correctness rules - Never call blocking functions in bottom half of the kernel - Check NULL pointers returned from routines - Do not allocate large variables on the kernel stack - Do not make inconsistent assumptions about whether pointer is NULL - Check array/loop bounds derived from user data - Release acquired locks, and do not double-acquire - Restore disabled interrupts - Do not use freed memory - Do not use floating point in the kernel - Check realloc result so as not to leak memory - Do not dereference untrusted user pointers - Allocate enough memory for type that is stored there In previous work, found many violations of correctness rules Problem: How to find such rules if you don't understand the system Real programs have 100s of unspecified rules Idea: Try to infer the rules from the source code itself Why should this work? Intuition: What if a detective interrogates witnesses? Doesn't know what actually happened But if two people contradict each other, one must be lying Also if 100 people say X and 1 person doesn't, probably the 1 is incorrect Same with code (p. 3 of paper): struct mxser_struct *info = tty->driver_data; // MUST: tty not NULL //... if (!tty || ...) ... // Here tty may or may not be NULL In general can infer beliefs: x = *p; // MUST: programmer believes p is not NULL // MAY: x may be protected by l free (p); // MUST: p is heap allocated // MUST: p not used any more unlock (l); // MUST: l acquired One technique: Statistical analysis Find lots of repeated idioms: lock(a); x++; unlock(a); ... lock(a); x++; unlock(a); ... Then an unprotected x++ means MUST: x not protected by l If x almost always protected by lock(a), unprotected access probably a bug (MAY belief conflicts with MUST belief) Another technique: Internal consistenty Check for self-contradiction if (card==NULL) printk (KERN_ERR, "capidrv-%d: ... %d!\n", card->contrnr, id) Example: Can't dereference user-supplied pointers in the kernel Can only access them through special paranoid routines Self-contradiction in line of linux 2.4.4 kernel (p. 11): if (copy_to_user (rt, ipddp_find_route(rt), ...)) First use of rt "taints" the variable -- MUST be user-supplied Second use of rt (in find_route) is uncheck -- MUST be trusted General approach for infering bugs: User supplies some TEPLATES Cannot use a pointer after passed to function must be paired with , , and are SLOTS Actual code that could fit in is called a SLOT INSTANCE Example: use after free -- don't want to use a pointer after freeing it Problem: There are many "free-like" functions in the kernel--how to find? To find slot instances, find functions that take last use of pointer -- they MAY be free-like functions Then use statistical analysis Another problem: Finding too candidate slot instances E.g., what pairs of functions to check for lock/unlock-like behavior LATENT SPECIFICATIONS can solve the problem (sec 5.2) E.g., Use substrings in function names: lock, alloc, release, assert, ... Ranking output - What to do if you aren't sure it's a bug Rank the possible bugs by likelihood that they are Then investigate in order, stop when too many false positives