CS 140 Summer 2005 Final Solutions

Statistics

For this exam, the mean was 65, the median was 67, and the standard deviation 15.

Retrieving Exams

Exams may be picked up in Ben's office. Please call ahead at 650 723-3654 to make sure he'll be in.

SCPD Students

If you're a SCPD student, then we'll send it back to you through the SCPD if you attached an SCPD routing slip to your exam. Otherwise we'll assume that you want to pick it up in person. You can email the staff list if you change your mind.

Grading Errors

If you believe that part of your exam was graded in error, please request a regrade. You should preferably attend the office hours of the person who graded the question (shown below), or as a second choice email the staff list. Arithmetic errors should go to the staff list or the office hours of any of the staff.

Problem Grader

1, 2, 3, 4, 5 Kevin

6, 8, 9, 11 Ben

7, 10 not graded (see below)

Problem	Grader
1, 2, 3, 4, 5	Kevin
6, 8, 9, 11	Ben
7, 10	not graded (see below)

Questions 7 and 10 were thrown out and not graded. See the specific reasoning for each question below.

Problem 1

Question

Consider a multitasking OS for a symmetric multiprocessing platform. In each of the following situations, state whether you would prefer to use a spinlock or a semaphore for synchronization, and why. If you believe that neither is an acceptable choice, explain.

(a) Incrementing a hit counter for a web server.

(b) After a disk read command has been delivered to a disk device, waiting for the sector read to complete.

Answer

Vocabulary: A SPINLOCK is where a CPU spins (runs no-op instructions) for a very short period of time before retesting the lock; the CPU does not yield the CPU. Spinlocks are most commonly used in interrupt handlers where blocking is not an option and, if a lock is held, another CPU will be releasing it in very short order.

(a) A spinlock is ideal, because the lock should be available again very soon and the overhead of blocking on a semaphore is much longer than the expected time until lock acquisition.
Answers suggesting per-thread counters that are merged periodically / at termination - or other VALID no-synchronization-required strategies - were also acceptable.

(b) A semaphore is ideal, because a spinlock will not yield the CPU until the disk I/O is complete and therefore will waste ~5ms (avg) CPU time.

Problem 2

Question

In a Unix-like file system, is the file name stored in a directory or an inode? What file system feature (or features) makes this choice more reasonable than the other?

Answer

File names are stored in directories in Unix-like filesystems. Reasons I accepted include:

To implement hard-links; each link has its own filename
Cheaper to search for filename in directory because less blocks accessed (dir data instead of O(n) inodes)
Cheaper to follow a path, because only inodes along the path are accessed
Any legitimate reason that points to O(N) less disk accesses. (Saving inode space is O(1), which is not enough)

Problem 3

Question

Old Unix-like systems did not provide any primitives intended for synchronization of user processes, but they did provide pipes. A pipe is a channel for transmitting a stream of data, based on a fixed-sized buffer (often 4 kB). Reading an empty pipe blocks until data is written. Writing to a pipe whose buffer is full blocks until data is read.

(a) Explain how to implement a semaphore using a pipe. (Hint: take advantage of the pipe's blocking semantics.)

(b) A semaphore implemented as a pipe may deadlock in a situation where a conventionally implemented semaphore would not. Describe the situation.

Answer

(a) Semaphore up = write 1 byte, Semaphore down = read 1 byte. I also accepted write/read 4K and vice versa (up=read 4K, down=write 4K) because, though less efficient, they also have semaphore semantics. Note that sizes were important - if you just mentioned up=write and down=read, I gave half-credit.

(b) Semaphore UP can block. A valid example includes >4096 up calls before any down calls; this deadlocks. IF your semaphore implenentation used 4K blocks, I accepted two successive calls for the example.

Problem 4

Question

fd is an open, writable file descriptor for a zero-length file in a Unix-like file system. Its file pointer is set to an offset of 10⁹ bytes. What would each block written by the following function call contain?

write(fd, &data, 1);

Answer

Data itself (1 byte) - 2pts
Zero-filled blocks for first 10⁹ bytes OR mention of sparse files - 1pt
Mention of ANY metadata - 2pts (this was the point of the question!)
indirect / doubly-indirect blocks or equivalent index structure - 2pts
inode written back - 2 pts
freemap (or other filesystem metadata) - 1pt

Problem 5

Question

Professor Bob designs a file system that allocates file data using a variable-length array of extents. Suggest two advantages and two disadvantages of this scheme compared to the data allocation schemes discussed in class.

Answer

Here are the answers I accepted.

Advantages:

Less metadata to store
Faster sequential access
Easy to extend to large disk sizes
Pointer strategy reduces external fragmnetation
Internal fragmentation easy to reduce

Disadvanteges:

Can suffer from severe fragmentation
Hard to grow files (e.g. complex freelist)
Need to relocate parts of the files / metadata during growth
Pointers can be inefficient if extents are too small
Seeks to find extents can be inefficient

Problem 6

Question

One-time passwords.

(a) Are one-time passwords vulnerable to replay attacks when used for authentication on physical machines? On virtual machines?

(b) Are one-time passwords vulnerable to man-in-the-middle attacks on physical machines? On virtual machines?

Answer

(a) Given proper implementation, one-time passwords are not vulnerable to replay attacks on physical machines. Replaying the password will not authenticate to the machine, because a different password is required each time.

One-time passwords on virtual machines may be vulnerable to replay attacks even in situation where physical machines are not. Consider checkpointing a virtual machine, then resuming from the checkpoint and authenticating via a one-time password. If the one-time password state is stored in the checkpoint, then resuming from the same checkpoint a second time and authenticating with the same password is possible.

(b) Some students answered this question from the viewpoint of local authentication, others from the viewpoint of network authentication. A variety of answers were acceptable. In short, one-time passwords do not protect against man-in-the-middle attacks; it is not what they are designed to do. Some students mentioned other mechanisms that can protect against them, such as secure attention keys. Interposing a virtual machine monitor has little effect, except to add another layer that can be subverted.

Problem 7

Question

Why can salts be made public?

Answer

This question was thrown out and not graded, because very few correct answers were seen and I suspected that the correct answers were from knowledge learned outside the class.

A salt is a random number hashed along with a password. The salt value is usually represented in cleartext alongside the hashed password. The purpose of the salt is to prevent dictionary attacks, that is, hashing every possible password (or the most common passwords) so that any hashed password can simply be looked up in the list. Salting also keeps it from being obvious when more than one user has the same password. Neither of these purposes is subverted if the salt value is made public.

Problem 8

Question

Alice and Bob have previously exchanged public keys. Now Alice wants to send a confidential message to Bob.

(a) What should Alice do? Fill in each blank with one of the following words: signs, encrypts, Alice, Bob, public, private.

Alice ______ the message with ______'s ______ key
then ______ the result with ______'s ______ key.

(b) Could Alice reasonably perform these operations in the opposite order? Why or why not?

Answer

(a) Alice encrypts the message with Bob's public key, then signs the result with Alice's private key.

Reversing the order of operations was also acceptable.

(b) There are advantages and disadvantages to each order of operations, but neither order is egregiously incorrect. Encrypting before signing means that both the sender and the recipient can be identified relatively easily, but it also means that the recipient (and anyone else who observes the message) need not waste time attempting decryption if the signature is incorrect. Signing before encrypting means that only the recipient can be easily identified by third parties (and the recipient), but the recipient might also waste time decrypting only to find out that the signature is invalid.

Many answers were acceptable here, including some answers that said that the order could not reasonably be reversed. I read the answers for an indication that the student had an understanding of public-key encryption concepts and the ability to apply them to the situation.

Problem 9

Question

An architecture supports 4 kB pages and 4 MB superpages. A process uses 4 MB of virtual memory. Give three reasons why many 4 kB pages might be a better way to map the region than one 4 MB superpage. Assume that the process's memory is properly aligned for a superpage.

Answer

Many possible correct answers here. A few of them:

You didn't say that the memory was contiguous virtually. If it isn't, then you have to waste virtual and physical memory.
The process is probably not going to need all that memory at once. It makes sense to be able to page out small parts of it at a time.
Parts of the memory might be shared among multiple processes. That's not possible with a single large page.
It's more expensive to write a large chunk of memory to disk (for swapping), or to read it from disk (for demand paging) than a small chunk. (As long as it's contiguous on disk it's not much more expensive though.)
Part of the page might be read-only and the rest read-write, etc.

Problem 10

Question

We wish to increase the size of a RAID array by adding one new disk. Is it easier to add the new disk if the array's RAID level is 4 or 5, and why? You may assume the new disk contains all zeroes.

Answer

This question was thrown out and not graded, because I decided that I had a bad idea of the expected answer.

Expected answer: In RAID 4 suppose we have 4 disks, set up so that A ^ B ^ C = P, with P the parity disk. Adding another all-zeros disk does not require changes to any other disk for parity purposes, because A ^ B ^ C ^ 0 = P. On the other hand, in RAID 5 the parity is distributed so we would need to move parity blocks to the new disk.

However, my expected answer didn't take into account the actual data, which is striped across all disks for both RAID 4 and 5. In either case, we'd either need to adjust the striping function in some non-obvious way, or move data around. It's difficult to tell whether students realized this (and thus rejected the expected answer above) or not, so I threw out the question.

Problem 11

Question

Networking.

(a) Wired links, such as Ethernet, FDDI, and ATM, do not normally feature encrypted or reliable transmission. What is the justification for this design decision?

(b) Many wireless links, including all variants of 802.11, do have encryption and reliability features. In each case, why make a different design decision for wireless?

Answer

(a) The end-to-end argument, that is, when if a network service is not needed by every layer above the layer in question, then it should not be implemented in that layer.

(b) Encryption: In wireless, every packet is broadcast to the world at large. The threat model is therefore different from wired networking, where we can assume some degree of physical security due to the need to physically tap a wire. (In fact, the name of the WEP protocol for encrypted wireless communication stands for Wired Equivalent Protocol.)

Reliability: Wireless communication is much more lossy than wired. A wired link would typically have less than 1% packet loss due to errors (e.g. on this machine there has been only 1 error in the last half GB of traffic), but wireless packet loss might be much higher, say half of packets. All higher level protocols already have to deal with lossy links, so this would not cause correctness problems, but it would cause performance problems. In short, adding reliability features to wireless protocols is a performance optimization.

cs140-sum0405-staff@lists.stanford.edu