Caching and Consistency
=======================

Problem: ensuring that network file system clients see the same data

Two interesting cases:

Sequential write sharing: only one writer, or multiple readers, at a time
  How do NFS and AFS offer consistency under sequential write sharing?
  NFS - poll the server for freshness on every open
    (see the open-time freshness sketch at the end of these notes)
    Disadvantage -- heavy load on the server
    Disadvantage -- incur a round trip on every open
  AFS - have the server call you back
    Advantage -- less load on the server, fewer round trips
    Disadvantage -- server must keep state (potentially large)
    Disadvantage -- what if a client disappears?
  Another mechanism: leases - server promises a callback for a limited time
    Advantage -- same as AFS callbacks
    Advantage -- plus, can recover from a dead client
    Disadvantage -- have to wait out the lease time to recover from a dead
      client or a server reboot (since leases are kept in memory)
  Lease implementation issues (see the lease-expiry sketch at the end):
    The lease may include an absolute expiration time
      Treat it as valid only until ExpTime - MaxClockSkew,
      so this requires roughly synchronized clocks
    Or the lease can be relative: "N seconds from now"
      A conservative estimate of the expiration is then RPC_Issue_Time + N

Concurrent write sharing: multiple simultaneous accesses, at least one of
which is a write
  How do NFS and AFS provide consistency under concurrent write sharing?
  Short answer: they don't, really
  NFS - changes are written to the server asynchronously
    Other clients will see changes even before the writes reach the server's
      disk (i.e., before they are COMMITted) -- unless maybe the server crashed
    So you don't get strict consistency:
      Say you read immediately after my write returns
      The write is asynchronous (happens before close), so you might miss it
      And if you are caching the data block, you won't check with the server
    What about mmap? (Forget it)
  AFS - even worse than NFS
    Writes are not visible to other clients until you close the file
    If two clients write a file, the last one to close it wins

How to achieve concurrent write sharing?
  Example: the Sprite file system (will come up again when we read Zebra)
  Motivation - a study of BSD file usage:
    75% of files are open less than 1/2 second, 90% less than 10 seconds
    20-30% of files are gone within 30 seconds
    So they wanted to avoid writing files through to the server on close
  Sprite write policy (see the write-back sketch at the end):
    Writes go first to the client cache, then the server cache, then disk
    A block moves to the next stage when unmodified for 30 seconds
    Benefit of the delay -- exploit overwrites and deletes
    How useful in practice?
      Speculation: many such files are in /tmp, so maybe just make /tmp local...
  Sprite consistency -- put each file in one of several modes
    (see the open/close sketch at the end):
    Sequential write sharing:
      After every modifying close, bump a version number
      The version number tells other clients to flush their caches
    Concurrent write sharing: disable caching
    What happens when a second open puts a file into concurrent write sharing?
      The server keeps track of the last writer
      The server asks the writer to flush its dirty blocks, and waits
      Then the server lets the open succeed
  Sprite issues relevant to today's paper:
    When a Sprite server crashed, it lost its information about open files
      In the initial implementation, you needed to kill all your client processes
      Later, the server could ask clients what files they had open
        But recovery is potentially slow if clients are not responding
      (This is why NFS wanted to be *stateless*)
    When a Sprite client crashed, it might never write back needed dirty blocks
    The "disk full" problem:
      The client allocates blocks, but the server might not have free space
      Applications expect the write system call to fail, not close
      Or, with Sprite's 30-second delay, the disk might fill after the
        application has already exited!

What about consistency of mmap? How could you make this work?
  Use paging, and have each writable page mapped on only one client
    (see the page-ownership sketch below)
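
Sketch: NFS-style freshness check on open. The sketches below are
illustrative only, all in Go; the function names, constants, and RPC stubs
are assumptions, not the real NFS/AFS/Sprite code. This one shows "poll the
server for freshness on every open": assuming a getattr-style RPC that
returns the file's modification time, the client drops its cached blocks
whenever the mtime has changed. Note the cost named above -- one round trip
per open, all landing on the server.

    // NFS-style open-time freshness check -- a sketch; getattrRPC and
    // the cache layout are illustrative assumptions.
    package fscache

    import "time"

    type cachedFile struct {
        mtime  time.Time        // server mtime when the blocks were cached
        blocks map[int64][]byte // cached data blocks, keyed by block number
    }

    // open polls the server for the file's current attributes (one round
    // trip per open) and invalidates the cache if the file has changed
    // since we cached it.
    func open(path string, cache map[string]*cachedFile,
        getattrRPC func(string) time.Time) {
        serverMtime := getattrRPC(path)
        if f, ok := cache[path]; ok && !f.mtime.Equal(serverMtime) {
            delete(cache, path) // stale: drop all cached blocks
        }
    }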
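
Sketch: conservative lease expiry. This just encodes the two expiration
rules from the lease discussion above; maxClockSkew and the helper names
are assumptions.

    // Client-side lease expiry bookkeeping -- a sketch.
    package lease

    import "time"

    const maxClockSkew = 2 * time.Second // assumed bound on clock skew

    // Absolute lease: the server sends an expiration time. Subtract the
    // maximum possible clock skew so we never use the lease after the
    // server thinks it expired. Requires roughly synchronized clocks.
    func absoluteExpiry(expTime time.Time) time.Time {
        return expTime.Add(-maxClockSkew)
    }

    // Relative lease: the server promises "N seconds from now". Measure
    // from the time we *issued* the RPC, not when the reply arrived: the
    // server granted the lease somewhere in between, so this errs on the
    // safe side and needs no synchronized clocks.
    func relativeExpiry(rpcIssueTime time.Time, n time.Duration) time.Time {
        return rpcIssueTime.Add(n)
    }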
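
Sketch: Sprite-style delayed write-back. One plausible reading of "a block
moves to the next stage when unmodified for 30 seconds": a periodic sweeper
flushes dirty blocks that have aged out. The data structures are
assumptions, not Sprite's implementation.

    // Delayed write-back -- a sketch. Run the same sweep at each stage
    // (client cache -> server cache -> disk).
    package writeback

    import "time"

    const writeBackDelay = 30 * time.Second

    type block struct {
        data    []byte
        dirty   bool
        modTime time.Time // when this block was last modified
    }

    // sweep runs periodically and pushes blocks that have sat unmodified
    // for 30 seconds to the next stage. Blocks overwritten or deleted
    // while still young never move at all -- that is the benefit of the
    // delay.
    func sweep(cache map[int64]*block, flushToNextStage func(int64, []byte)) {
        now := time.Now()
        for n, b := range cache {
            if b.dirty && now.Sub(b.modTime) >= writeBackDelay {
                flushToNextStage(n, b.data)
                b.dirty = false
            }
        }
    }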
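
Sketch: Sprite-style server-side open/close. This combines the two modes
above: every modifying close bumps the version number (sequential write
sharing), and an open that creates concurrent write sharing makes the
server recall the last writer's dirty blocks and disable caching. All
names and structures are assumptions.

    // Per-file state on the server -- a sketch.
    package sprite

    type fileState struct {
        version    int64
        lastWriter string          // client that last wrote; may hold dirty blocks
        readers    map[string]bool // clients currently open for reading
        writers    map[string]bool // clients currently open for writing
        cachingOff bool            // true in concurrent write-sharing mode
    }

    // openFile handles client c opening the file. It returns the current
    // version number; the client compares it against its cached copy's
    // version, and a mismatch means "flush your cache".
    func openFile(f *fileState, c string, forWrite bool) int64 {
        // Does this open create concurrent write sharing?
        concurrent := (forWrite && len(f.readers)+len(f.writers) > 0) ||
            (!forWrite && len(f.writers) > 0)
        if concurrent {
            if f.lastWriter != "" && f.lastWriter != c {
                askWriterToFlush(f.lastWriter) // synchronous: wait for dirty blocks
            }
            f.cachingOff = true // reads and writes now go through the server
        }
        if forWrite {
            f.writers[c] = true
        } else {
            f.readers[c] = true
        }
        return f.version
    }

    // closeFile: every modifying close bumps the version number, so later
    // sequential openers know their caches are stale.
    func closeFile(f *fileState, c string, wasWrite bool) {
        delete(f.readers, c)
        delete(f.writers, c)
        if wasWrite {
            f.version++
            f.lastWriter = c
        }
    }

    func askWriterToFlush(client string) { /* stub: callback RPC to client */ }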
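
Sketch: single-writer page ownership for mmap. One way to read "each
writable page mapped on only one client" is as a directory-based coherence
protocol with the server playing the directory; the RPC stubs are
assumptions.

    // Per-page state on the server -- a sketch.
    package mmapshare

    type pageState struct {
        owner   string          // client with the sole writable mapping, or ""
        readers map[string]bool // clients with read-only mappings
    }

    // mapWritable transfers page ownership to client c. The previous
    // owner must give the page back (its memory holds the freshest copy),
    // and every read-only mapping is invalidated so stale readers
    // page-fault back to the server -- just like invalidation in a
    // directory-based cache-coherence protocol.
    func mapWritable(p *pageState, c string) {
        if p.owner != "" && p.owner != c {
            recallPage(p.owner) // stub: fetch contents, unmap at old owner
        }
        for r := range p.readers {
            invalidateMapping(r) // stub: client unmaps its read-only copy
        }
        p.readers = map[string]bool{}
        p.owner = c
    }

    func recallPage(client string)        { /* stub: callback RPC */ }
    func invalidateMapping(client string) { /* stub: callback RPC */ }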