Hails: Protecting Data Privacy in Untrusted Web Applications
============================================================

What is the motivation for this work?
  What is a web platform?
  How does Hails differ from other web frameworks?
    Policies specified alongside schema
    Mandatory access control (MAC)
  Examples of common problems:
    Paymaxx vulnerability--let people step through DB with key in URL
    What is a mass assignment vulnerability?

What's an MP?
  "Model and policy" code
  Owns a database
  Provides a library through which VCs (and other MPs) can access database
  Determines security policy for information stored in the database

What's a VC?
  "View and controller" code
  Basically an HTTP server
  Gets HTTP requests, interacts with MPs, and formats results

What are principals?
  Users
  Remote web sites
  VCs (@nameofVC)
  MPs (_nameOfMP)

What is the DC label lattice?
  Label is two positive (i.e., no negation) boolean formulas <S, I>
    S says who can read data (e.g., alice \/ bob)
    I says who can write or may have written the data
  Check by logical implication:
    p => S means p can read, p => I means p can write
    Example: label <alice \/ bob, alice>
      bob can read because bob => alice \/ bob
      but bob cannot write data with such a label, bob =/> alice
  <S1, I1> [= <S2, I2> iff S2 => S1 and I1 => I2.  Intuition:
    Say principal p can read <S2, I2>
      have p => S2 => S1, meaning p => S1, so p could already read <S1, I1>
      In other words moving data along [= never increases who can read it
    Say principal p can write <S1, I1>
      have p => I1 => I2, so p could also write <S2, I2>
      so data at <S2, I2> was written only by principals allowed to write it
      Moving data along [= doesn't allow unauthorized writes to creep in

What are privileges?

What database structure (disregarding IFC)?
  Basically MongoDB
  A *database* contains *collections* of *documents*
    Collections are basically analogous to SQL tables
    Documents (analogous to rows) are JSON objects
      i.e., string -> value maps, each string:value pair called a *field*
    Unlike SQL: no fixed schema, no joins
  Some fields are *keys*, which are indexed within collection
  Remaining (non-indexed) fields referred to as *elements*

Let's explore straw-man labeling schemes.  What's wrong with...
  Giving each collection a fixed set of fields with fixed labels?
    Different documents should have different labels for same field
    E.g., imagine alice and bob each represented by document in database
      alice should be able to update alice's address and bob bob's
  Storing an explicit label for each value in the database?
    I.e., document is string -> (label,value) map, label protecting value
    But now how is label set?  Error prone if logic strewn throughout code
    What is the label on a label?  (E.g., who can see or change it)
    How can you evolve the policy?  Must update every doc in database?

How are labels actually specified in the database?
  A database has a static label
    Coarse-grained policy of who can create collections, access DB at all
  Each collection has a static label
    Similarly coarse-grained protection
    If you can read data with collection label, you can see all the keys
      Provides unambiguous label rules for access to database index
      But also means key fields cannot have per-document labels
  Each document has a label which is a function of keys (and public elements)
    E.g., lets you specify that user field determines who can write doc
  Some elements have additional policies as function of keys (and public elems)
    E.g., only user, friends, and MP can read email address field
  Look at Figure 3 for example

How does writing to a database work?
  Two possibilities
    1) write pre-labeled data
       Use privileges or taint to pack individually labeled values into document
    2) have database automatically label data
       In this case, must supply sufficient privileges for all labels to be set

How does an MP use its privileges? (_MP)
  Convert post to doc while preserving integrity
  What is the win from labeled web posts (p. 10 "Trustworthy user input")?

How does a VC use its privileges? (@VC)
  Advantages/disadvantages of privilege delegation
    + May be required by your application structure
    - Means VC should be defensively coded where it uses privileges
    - Means people aren't free to improve/replace VCs

How would you add a new wiki to GitStar?

Why does Hails have OS-level confinement, and what is this?

Why does Hails require a browser extension, and what does it do?

What evaluation questions should we ask?
  What is the performance cost?
  Does the system provide better security?
  Is the system actually usable?
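
Sketch for the DC label questions above.  This is illustrative Python, not
Hails code.  It assumes each positive formula is kept in conjunctive normal
form as a set of clauses, with an empty formula meaning True.  All names here
(cnf, implies, Label, can_read, can_write, can_flow_to, doc_label, _usersMP)
are made up for the example.  The last function shows one possible reading of
"document label is a function of keys": the user key decides who can write.

```python
from dataclasses import dataclass


def cnf(*clauses):
    # Positive formula in CNF: a frozenset of clauses, each clause a
    # frozenset of principal names.  cnf(["alice", "bob"]) is alice \/ bob.
    # cnf() has no clauses and therefore means True.
    return frozenset(frozenset(c) for c in clauses)


def implies(a, b):
    # For negation-free CNF, a => b holds exactly when every clause of b
    # is a superset of some clause of a.
    return all(any(ca <= cb for ca in a) for cb in b)


@dataclass(frozen=True)
class Label:
    secrecy: frozenset    # S: who can read
    integrity: frozenset  # I: who can write or may have written


def can_read(principal, label):
    # p => S
    return implies(cnf([principal]), label.secrecy)


def can_write(principal, label):
    # p => I
    return implies(cnf([principal]), label.integrity)


def can_flow_to(l1, l2):
    # <S1, I1> [= <S2, I2> iff S2 => S1 and I1 => I2
    return (implies(l2.secrecy, l1.secrecy)
            and implies(l1.integrity, l2.integrity))


def doc_label(doc):
    # Hypothetical per-document policy computed from the "user" key:
    # anyone can read, only that user or the MP (assumed name) can write.
    return Label(secrecy=cnf(), integrity=cnf([doc["user"], "_usersMP"]))


# The label from the notes: <alice \/ bob, alice>
example = Label(secrecy=cnf(["alice", "bob"]), integrity=cnf(["alice"]))
# A label with fewer readers and weaker integrity: <alice, alice \/ bob>
stricter = Label(secrecy=cnf(["alice"]), integrity=cnf(["alice", "bob"]))

print(can_read("bob", example))        # True:  bob => alice \/ bob
print(can_write("bob", example))       # False: bob =/> alice
print(can_flow_to(example, stricter))  # True
print(can_flow_to(stricter, example))  # False
```

Note that the flow from example to stricter shrinks the reader set from
{alice, bob} to {alice}, which matches the intuition that moving data along
[= never lets new principals read it.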