CS140 Autumn 99-00 Final Solutions

1. Short attention span questions

  1. No, the system could be thrashing VM. In this case, there's lots of I/O, but very little CPU activity
  2. Smaller blocks are less likely to be written to, and when they do get dirty, there is less that needs to be written back. Consider how writing once to a 512 K block differs from writing to one of two 256 K blocks.
  3. There will be greater overhead because extent-based systems use 2 words (pointer and size) to identify data, as opposed to 1 pointer for FFS. When filesystems are full, there are fewer contiguous blocks, and hence more extents need to be mapped.
  4. (This isn't the greatest answer...) Hierarchies imply a level of indirection, which results in a speed hit. Also, some hierarchies have a single point of failure.
  5. Library, but don't worry about this question
  6. Using copy-on-write for thread state
  7. Sequence numbers are used to filter out old packets
  8. Caches can help reliability in this case because we can use our local copy instead of going to the DNS server. If the DNS server went down, we wouldn't be affected for the cached entries.

2. Memory Allocation

No, she isn't completely correct. free() also serves a third purpose, which is to keep memory usage compact. For applications that do many mallocs and frees over their lifetimes, we can reuse the recently deallocated space. Without this reuse, our address space can become very fragmented, which leads to bad VM behavior.

3. Concurrency fun

No, this code is actually safe. Consider the various scenarios--being interrupted by a push, a pop, within each of the locks, between the locks, and at the beginning and end of each function. It appears dangerous because head_lock and count_lock are separate--and in fact it can lead to an incorrect value for count. However, it errs conservatively, and there's not actually a race condition, so it's okay.

4. Stupid Data Structure tricks

  1. Replacing fs meta data with "file tables"
    1. Worse--because the table is fixed size, and must accommodate the largest possible file size
    2. Better--if most of the blocks are used, then we have eliminated the need for indirect blocks
    3. Worse--file info like size is spread out (in between each big file table) instead of clustered together within directories
    4. Better--given the file size, just jump to the end of the table
    5. Worse--since files are variable sized and these tables are fixed sized, we'll start running into problems. This should be a familiar theme!

5. Sort of Logging

1. for mv foo/a bar/b
(1) link a to b, update directory entry.
(2) flush bar directory metadata.
(3) unlink foo/a, update directory entry.
(4) flush foo directory
(5) wipe log.

Upon a crash...
if exists a, not exists b => restart @ (1)
if exists a, exists b => restart @ (3)
if not exists a, exists b => ok, restart @ (5)

2. The problem with an unbounded log is the lack of idempotency.
Suppose we allow this sequence in our log
mv a b
mv d a
cp q d
rm b
crash here
We'll restart, and it will appear to be in our original state. Unfortunately, while the names are the same, the data is not.

6. Faking What you don't got

u32 fault_handler(u32 badva, int write, struct seg *st, int nsegs) {
  u32 segnum = badva >> 16;     // get seg num from the top 16 bits of the va
  u32 offset = badva & 0xFFFF;  // get the offset from the low 16 bits
  int i;

  for (i = 0; i < nsegs; i++) {   // look for the seg number in the seg table
    if (st[i].segnum != segnum)
      continue;                   // no match, keep looking

    // if we're out of bounds in this seg, it's a bad access
    if (offset >= st[i].nbytes)
      kill_process();   // (stands in for however invalid accesses are handled)

    // *** check permissions ***
    // assuming prot bit 0x2 = write, bit 0x1 = read
    if (write == 1 && !(st[i].prot & 0x2))       // check write permission
      kill_process();
    else if (write == 0 && !(st[i].prot & 0x1))  // check read permission
      kill_process();

    // okay, we have a valid physical address now
    return st[i].base + offset;   // calculate the real address
  }

  kill_process();  // i == nsegs: didn't find a matching segment
}

Last modified: Sat Mar 4 14:17:42 PST 2000