Winter 2003 Final Solutions

About This Exam

1-4, 9-11Mendel
5-8, 12-14John

If you believe that part of your exam was graded in error, we'd like you to directly contact the person who graded the question. Arithmetic errors should be e-mailed to the staff list and those exams can be brought to the Office Hours of any of the Winter CS140 staff.

a: Virtual circuits have the advantage because at the time the virtual circuit is established a particular quality of service guarantee may be reserved.
Datagram protocols do not have a setup phase, so they are unable to make those reservations.

b: Datagram-based protocols have the advantage because they are individually routed; thus the failure of a link or router will require no effort on the part of sender or receiver, but will be automatically routed around by the network.
With virtual circuits, it is necessary to tear down and recreate the connection in the face of a failure. They are unable to change routes on the fly.

c: We took both answers for this. On the one hand, a virtual circuit will get better use of a particular link because their packets have much smaller headers (just a circuit number, not a full address), so there's less overhead.
On the other hand, datagram protocols may be better if you take a whole-system view, because if there are several low-bandwidth links to the same destination a datagram-based protocol will send some packets over each of those links, while a virtual circuit will be confined to one. This mechanism of the datagram protocol is called statistical multiplexing.

6: TCP uses the sliding window mechanism to have multiple packets in flight at once. With the sliding window, the sender may have several packets in flight, rather than having to wait for an acknowledgement of each packet before sending the followup packet. See the lecture notes for a more full explanation of the sliding window.

7: IPv4 addresses are hierarchically arranged. This lets the system take advantage of spatial locality to reduce the number of routing entries a router needs. Instead of having an entry for every host, a router will have a single routing table entry for transport to many remote hosts.

8: Ethernet uses CSMA/CD with exponential backoff to handle contention. When a sender has a packet, they listen on the line until the line is free, then send. If they collide, they wait a random timeout in [0, n], the resend. If they collide again, the interval increases exponentially with subsequent collisions (2n, 4n, 8n, etc).

12: Without a write-ahead log, the file system must synchronously write out data to the file system, which will incur severe seek time penalties. With a write-ahead log, the only synchronous writes are to the log, and will be sequential writes (meaning little or no seek time), and hopefully on a separate disk altogether. Since that handles our consistency guarantees, the system is free to perform aggressive write-back caching.

13: The canonical case which obliterates shared FS/VM memory is reading sequentially through a big file. Unless it's explicitly protected against, this will cause the system to give over much of the physical memory to storing pages of that file, which will not even be used (since it's a sequential write, we don't reuse the old entries).
Note that just pointing out that a VM-heavy workload will take space away from the FS cache is not correct. In fact, this is what the system is designed to do: if you are using a lot of VM, you want to give that space to the VM manager (and the reverse). The problem with the scheme is that it can be exploited to make sure that neither the FS cache nor the VM system have good hit rates, because all the memory's being wasted caching a file we'll never use again.

a: Increasing the block size will improve sequential file performance. When the size of the block increases, spatial locality increases, and sequential reads take advantage of that. To read the same number of bytes will take fewer disk I/O's with bigger blocks, which means less overhead (seek time, rotation time, OS I/O processing overhead, etc).
b: Increasing the block size should somewhat improve random access performance as well, because you have to iterate across fewer pointers to find the desired block. Note that since FAT keeps the pointer list in the file allocation table at the head of the disk, which is usually cached, these are in-memory pointers not on-disk pointers. Alternatively, credit was given for pointing out that random writes are usually smaller than the disk size, so when you read the larger block size you would spend time reading extraneous data into memory, which would decrease performance.
c: Internal fragmentation increases markedly as the block size increases. On average, each file will waste 1/2 the space of its last block, so doubling the block size doubles that overhead. This could be further exacerbated if the average file size was significantly smaller than the block size, in which case each file would waste more than 1/2 a block.