PFS
===

What is the storage model for PFS?
  - Storage system (e.g., network-attached disk) managed by untrusted entity
  - Want to make sure any tampering with data gets detected
  - Want system to be adaptable to many file systems

Straw man 1:  MAC every block you store in the system
  - Block plus MAC is no longer a power of two in size (or MAC is stored
    elsewhere and updates are not atomic)
  - Intruder can switch two blocks, X and Y, and each may still pass its check
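
  A minimal Python sketch of the block-swap problem, assuming the MAC covers
  only the block data. KEY, mac, disk, and read_block are made-up names for
  illustration, and HMAC-SHA256 is just a convenient stand-in MAC.

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical secret known only to the client


def mac(data: bytes) -> bytes:
    """MAC over the block contents only (no block number)."""
    return hmac.new(KEY, data, hashlib.sha256).digest()


# Untrusted disk: each slot holds (data, MAC of that data).
disk = {
    0: (b"block X", mac(b"block X")),
    1: (b"block Y", mac(b"block Y")),
}


def read_block(n: int) -> bytes:
    """Return block n if its stored MAC matches its stored data."""
    data, tag = disk[n]
    if not hmac.compare_digest(tag, mac(data)):
        raise ValueError("tampering detected")
    return data


# The intruder swaps the two slots, MACs included.
disk[0], disk[1] = disk[1], disk[0]

# Both reads pass the check, but slot 0 now returns Y's contents.
swapped = read_block(0)
```

  Including the block number in the MAC input would catch this swap, but an
  old (data, MAC) pair for the same block would still verify.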

Straw man 2:  Store hashes with block pointers (like SFSRO)
  - Makes updating more expensive
  - Indirect references to blocks (i-numbers) would need to be changed, too
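
  A rough Python sketch of why hashes-in-pointers make updates expensive. It
  uses a plain binary hash tree as a simplification (not SFSRO's actual
  layout); h, build_levels, and nodes_changed are made-up names.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(blocks):
    """Return the tree bottom-up: leaf hashes first, root hash last.

    Assumes len(blocks) is a power of two to keep the sketch short.
    """
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        # Each parent stores the hash of its two children's hashes.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def nodes_changed(before, after) -> int:
    """Count the tree nodes whose hash differs between two trees."""
    return sum(
        a != b
        for level_a, level_b in zip(before, after)
        for a, b in zip(level_a, level_b)
    )


blocks = [bytes([i]) * 4 for i in range(8)]
before = build_levels(blocks)

blocks[5] = b"new contents"  # update a single data block
after = build_levels(blocks)

# One hash changes on every level from the leaf up to the root.
changed = nodes_changed(before, after)
```

  With 8 blocks the tree has 4 levels, so a one-block update rewrites 4
  hashes; a flat block map would rewrite one entry.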

How does PFS solve the problem?
  - Have block map, which contains hash of every block on disk
  - For 16GB partition, 8KB blocks, 16-byte hashes -> 32MB
  - When you read a block, must check against entry in block map
  - When you write a block, must update map
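
  A minimal Python sketch of the block map idea, not PFS's actual code.
  BlockMap, block_hash, and map_size_bytes are made-up names, and BLAKE2b
  with a 16-byte digest is only a stand-in for whatever 16-byte hash PFS
  uses. map_size_bytes reproduces the size arithmetic above.

```python
import hashlib

BLOCK_SIZE = 8 * 1024  # 8KB blocks
HASH_SIZE = 16         # 16-byte hashes


def map_size_bytes(partition_bytes: int) -> int:
    """One hash per block on the partition."""
    return (partition_bytes // BLOCK_SIZE) * HASH_SIZE


def block_hash(data: bytes) -> bytes:
    # Assumption: any collision-resistant 16-byte digest would do here.
    return hashlib.blake2b(data, digest_size=HASH_SIZE).digest()


class BlockMap:
    def __init__(self):
        self.disk = {}  # untrusted storage: block number -> data
        self.map = {}   # trusted map: block number -> hash

    def write(self, n: int, data: bytes) -> None:
        """Writing a block must also update its map entry."""
        self.disk[n] = data
        self.map[n] = block_hash(data)

    def read(self, n: int) -> bytes:
        """Reading a block must check it against its map entry."""
        data = self.disk[n]
        if block_hash(data) != self.map[n]:
            raise ValueError(f"block {n} does not match block map")
        return data
```

  Example: map_size_bytes(16 * 2**30) gives 2**21 blocks * 16 bytes = 32MB.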

How is map stored?  Stored on WAFS.
  Key idea:  Can merge file system journal with block map log

What is WAFS?
  - Simple file system designed to store file system journals
  - Supports limited operations:
      append - appends log entry to a file, returns LSN
      read - takes an LSN as an argument, returns log entry
  - Can be mounted synchronously or async (if want async creates, etc.)
  - WAFS can be on different disk from file system it is journaling (faster)
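
  A toy in-memory Python model of the append/read interface. AppendLog is a
  made-up name, and using the entry's byte offset as its LSN is an assumption
  made for illustration, not a statement about WAFS's on-disk format.

```python
import struct


class AppendLog:
    """In-memory stand-in for a single log file."""

    _HEADER = struct.Struct(">I")  # 4-byte big-endian length prefix

    def __init__(self):
        self._buf = bytearray()

    def append(self, entry: bytes) -> int:
        """Append a log entry and return its LSN."""
        lsn = len(self._buf)  # assumption: LSN = byte offset of the entry
        self._buf += self._HEADER.pack(len(entry)) + entry
        return lsn

    def read(self, lsn: int) -> bytes:
        """Return the log entry that starts at the given LSN."""
        (length,) = self._HEADER.unpack_from(self._buf, lsn)
        start = lsn + self._HEADER.size
        return bytes(self._buf[start:start + length])


log = AppendLog()
first = log.append(b"first")
second = log.append(b"second")
```

  LSNs only grow, so "log flushed past LSN n" is a simple integer comparison.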

How do you trust contents of WAFS?
  - Can store on local disk
  - Log entries are all MACed, so untrusted storage OK, too
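
  A small Python sketch of MACed log entries. seal, unseal, and KEY are
  made-up names; HMAC-SHA256 and the choice to cover the LSN in the MAC (so
  an entry cannot be moved to another position) are assumptions, not details
  taken from WAFS.

```python
import hashlib
import hmac
import struct

KEY = b"demo-key"  # hypothetical secret kept off the untrusted storage
TAG_SIZE = 32      # HMAC-SHA256 output length


def _tag(lsn: int, payload: bytes) -> bytes:
    # The MAC covers the entry's position as well as its contents.
    return hmac.new(KEY, struct.pack(">Q", lsn) + payload, hashlib.sha256).digest()


def seal(lsn: int, payload: bytes) -> bytes:
    """Produce the record to store on untrusted storage."""
    return _tag(lsn, payload) + payload


def unseal(lsn: int, record: bytes) -> bytes:
    """Check the MAC and return the payload, or raise ValueError."""
    tag, payload = record[:TAG_SIZE], record[TAG_SIZE:]
    if not hmac.compare_digest(tag, _tag(lsn, payload)):
        raise ValueError("log entry failed MAC check")
    return payload
```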

When you update a block...
  1. Update contents of block in buffer cache
  2. Flag buffer as not yet hashed
  3. hashd hashes block, writes result to head of journal
  4. Clear flag and set buffer header to contain the journal entry's LSN
  5. Do not write block back until log has been flushed past LSN
  - Only 1 and 2 must happen synchronously with system call
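
  The five steps above can be sketched in Python as follows. Buffer, Journal,
  update_block, hashd_pass, and can_write_back are made-up names, SHA-256 is
  a stand-in hash, and the journal is an in-memory list whose index serves
  as the LSN.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Buffer:
    data: bytes
    needs_hash: bool = False   # step 2 flag: "not yet hashed"
    lsn: Optional[int] = None  # LSN of the journal entry with our hash


class Journal:
    def __init__(self):
        self.entries = []
        self.flushed_upto = -1  # highest LSN known to be durable

    def append(self, entry) -> int:
        self.entries.append(entry)
        return len(self.entries) - 1

    def flush(self) -> None:
        self.flushed_upto = len(self.entries) - 1


def update_block(buf: Buffer, data: bytes) -> None:
    """Steps 1-2: the only work done synchronously with the system call."""
    buf.data = data
    buf.needs_hash = True


def hashd_pass(journal: Journal, blockno: int, buf: Buffer) -> None:
    """Steps 3-4: hash the block later and log the result."""
    if buf.needs_hash:
        digest = hashlib.sha256(buf.data).digest()
        buf.lsn = journal.append((blockno, digest))
        buf.needs_hash = False


def can_write_back(journal: Journal, buf: Buffer) -> bool:
    """Step 5: hold the block until the log is flushed past its LSN."""
    if buf.needs_hash or buf.lsn is None:
        return False
    return buf.lsn <= journal.flushed_upto
```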

How do you checkpoint the block map?
  - Can't write whole thing to disk atomically, way too big
  - Use partial checkpoints:
      Break map into chunks, checkpoint chunks in round-robin fashion
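
  A minimal Python sketch of round-robin partial checkpoints. Checkpointer
  and chunk_ranges are made-up names and the chunk size is arbitrary; a real
  checkpoint would be written to the log rather than returned.

```python
from itertools import cycle


def chunk_ranges(num_entries: int, chunk_entries: int):
    """Split [0, num_entries) into consecutive (start, end) ranges."""
    return [
        (start, min(start + chunk_entries, num_entries))
        for start in range(0, num_entries, chunk_entries)
    ]


class Checkpointer:
    def __init__(self, block_map: list, chunk_entries: int):
        self.block_map = block_map
        # cycle() revisits the chunks forever, in order.
        self._next = cycle(chunk_ranges(len(block_map), chunk_entries))

    def checkpoint_one(self):
        """Snapshot the next chunk; returns (first entry index, copy)."""
        start, end = next(self._next)
        return start, list(self.block_map[start:end])
```

  Each call covers one chunk, so a full pass over the map takes as many
  checkpoints as there are chunks.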

What happens if you crash after writing log entry, but before writing block?
  Danger:  Could have new hash value in block map, old data in block
  Solution:  Each checkpoint must be followed by an "async map"
    Async map contains old values of blocks not yet written

How to recover block contents after a crash?
  - Some data exists on disk--but must verify its integrity
  - Log will contain some number of hash values for any particular block
  - On-disk block could actually match any of those hash values
  - Solution:  Store two LSNs in buffer header:
      buffer-end:  The LSN you must write before writing the block
      buffer-begin:  The LSN of the last time the block was written
  - hashd logs the oldest buffer-begin LSN of any dirty block
  - Thus, do not believe any hashes in log records before that LSN
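
  A loose Python sketch of the candidate-hash check during recovery. The log
  layout (lsn, block number, digest), the function names, SHA-256, and
  treating the cutoff LSN as inclusive are all assumptions; falling back to
  the checkpointed map for blocks with no recent log records is left out.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def candidate_hashes(log, blockno: int, cutoff_lsn: int):
    """Hashes logged for blockno at or after the cutoff LSN.

    log is a list of (lsn, blockno, digest) tuples in LSN order;
    cutoff_lsn is the oldest buffer-begin LSN that hashd logged.
    """
    return [
        digest
        for lsn, b, digest in log
        if b == blockno and lsn >= cutoff_lsn
    ]


def verify_after_crash(log, blockno: int, on_disk: bytes, cutoff_lsn: int) -> bool:
    """Accept the on-disk block only if it matches a believable hash."""
    return h(on_disk) in candidate_hashes(log, blockno, cutoff_lsn)


# Block 7 was hashed three times; only records from LSN 1 on are believed.
log = [(0, 7, h(b"v0")), (1, 7, h(b"v1")), (2, 7, h(b"v2"))]
cutoff = 1
```

  Here v1 and v2 are acceptable on-disk contents for block 7, while v0 (too
  old) and anything never logged are rejected.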

Attacks on the system?  Freshness.