Caching in the Sprite Network File System
=========================================

Assumptions
  Memories will be large enough to hold entire working sets
  Server CPU is the bottleneck

Structure of Sprite
  Client - client cache - server cache - disk
  Address cache blocks by file/position
    Client doesn't know disk layout
    Client can still allocate blocks
  Server also caches physical metadata structures

Write policy
  First write to client cache, then server cache, then disk
  Write to next stage when unmodified for 30 seconds
  Benefit of delay -- exploit overwrites and deletes
  How useful in practice?
    BSD study suggests 20-30% of files are gone within 30 seconds
    Paper last time suggested many overwrites, but 30 seconds is maybe too short
    What are the most short-lived files?
      Speculation: many files in /tmp, so maybe just make /tmp local...
  Why not write on close (NFS/AFS)?
    BSD study says 75% of files are open less than 1/2 second, 90% less than 10 seconds

Cache consistency
  What is sequential write sharing?
    Only one writer, or multiple readers, at a time
  Protocols to support sequential write sharing
    NFS - poll server for freshness on every open
      Disadvantage -- heavy load on server
      Disadvantage -- incur a round trip on every open
    AFS - have the server call you back
      Advantage -- less load on server, fewer round trips
      Disadvantage -- server must keep state (potentially large)
      Disadvantage -- what if a client disappears?
    Leases - server promises a callback, but only for a limited time
  What is concurrent write sharing?
    Multiple simultaneous accesses, at least one of which is a write
  How does Sprite handle write sharing?
    Put the file in one of several modes:
      Sequential write sharing -- one writer, or some readers, not both
        After every modifying close, bump the version number
        Version number tells other clients to flush their caches
      Concurrent write sharing -- disable caching
  What happens when a second open puts a file in concurrent write sharing?
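As a toy sketch of the open-time bookkeeping just described (all class and method names are hypothetical, and real Sprite would also revoke caching from clients that opened the file earlier):

```python
class Client:
    """Toy Sprite client: caches data and may hold unflushed dirty blocks."""
    def __init__(self, name):
        self.name = name
        self.dirty = {}   # file -> data written but not yet flushed

    def flush(self, file, server):
        # Called by the server before it lets another client's open succeed.
        if file in self.dirty:
            server.data[file] = self.dirty.pop(file)

class Server:
    """Toy Sprite server: tracks versions, open modes, and the last writer."""
    def __init__(self):
        self.data = {}
        self.version = {}      # bumped after every modifying close
        self.last_writer = {}
        self.open_modes = {}   # file -> list of modes currently open

    def open(self, client, file, mode):
        # If another client holds newer dirty blocks, make it flush and wait.
        w = self.last_writer.get(file)
        if w is not None and w is not client:
            w.flush(file, self)
        modes = self.open_modes.setdefault(file, [])
        # Concurrent write sharing: some other open exists and at least one
        # open (old or new) is a write -> caching is disabled.
        cacheable = not (modes and ("w" in modes or mode == "w"))
        modes.append(mode)
        if mode == "w":
            self.last_writer[file] = client
        # The client compares the returned version against its cached copy;
        # a mismatch means its cache is stale and must be discarded.
        return self.version.get(file, 0), cacheable

    def close(self, client, file, mode):
        self.open_modes[file].remove(mode)
        if mode == "w":
            self.version[file] = self.version.get(file, 0) + 1

srv = Server()
a, b = Client("A"), Client("B")
srv.open(a, "f", "w")
a.dirty["f"] = b"hello"
ver, ok = srv.open(b, "f", "r")   # a second open while A is still writing:
assert ok is False                # concurrent write sharing -> no caching,
assert srv.data["f"] == b"hello"  # and A was forced to flush first
```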
    Server keeps track of the last writer
    Server asks the writer to flush its dirty blocks, and waits
    Then the server lets the open succeed
  Is the implementation optimal?
    No. Even once the writer is done, readers still can't cache.
    "tail -f" would produce terrible performance...
  What about consistency of mmap?  How could you make this work?
    Use paging, and have each writable page mapped on only one client
  p.9: "The cache consistency mechanism cannot guarantee that concurrent
    applications perform their reads and writes in a sensible order."
    What does this mean?  Example?

What happens if a Sprite server crashes? (sec 9)
  Uh oh, probably need to kill all your client processes
  This is why NFS wanted to be *stateless*

What happens if a Sprite client crashes?
  It might not write back needed dirty blocks

What is the disk full problem?
  Client allocates blocks, but there might not be free space on the disk
  Applications expect write to fail, not close
  Or with the 30-second delay, the disk might fill after the application exits
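The disk full problem falls straight out of delayed write-back. A toy model (names hypothetical; a logical clock stands in for real time) shows write() succeeding into the client cache, with the full disk discovered only at write-back, after the application may have exited:

```python
class FullDisk(Exception):
    pass

class ToyServer:
    """Stand-in server with a fixed number of free blocks (hypothetical)."""
    def __init__(self, free_blocks):
        self.free_blocks = free_blocks
        self.blocks = {}

    def store(self, key, data):
        if self.free_blocks == 0:
            raise FullDisk(key)
        self.free_blocks -= 1
        self.blocks[key] = data

class ToyClientCache:
    """Client cache with Sprite-style 30-second delayed write-back."""
    DELAY = 30

    def __init__(self, server):
        self.server = server
        self.dirty = {}   # (file, block) -> (data, last_modified)

    def write(self, file, block, data, now):
        # Returns immediately: the block is only buffered locally, and
        # nothing checks whether the server actually has room for it.
        self.dirty[(file, block)] = (data, now)

    def tick(self, now):
        # Write back blocks unmodified for DELAY seconds. A full disk is
        # discovered only here, long after write() reported success.
        for key, (data, t) in list(self.dirty.items()):
            if now - t >= self.DELAY:
                self.server.store(key, data)  # may raise FullDisk
                del self.dirty[key]

srv = ToyServer(free_blocks=1)
c = ToyClientCache(srv)
c.write("a", 0, b"x", now=0)   # application sees success
c.write("b", 0, b"y", now=0)   # also "succeeds"; the application can now exit
failed = None
try:
    c.tick(now=30)             # write-back finds the disk full...
except FullDisk as e:
    failed = e.args[0]         # ...with no application left to tell
```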