# CS 140 Summer 2005 Midterm Solutions

## Statistics

For this exam, the mean was 35.9, the median was 37, and the standard deviation 11.1.

## Retrieving Exams

Exams may be picked up in Ben's office hours or after class.

### SCPD Students

If you're a SCPD student, then we'll send it back to you through the SCPD if you attached an SCPD routing slip to your exam. Otherwise we'll assume that you want to pick it up in person. You can email the staff list if you change your mind.

If you believe that part of your exam was graded in error, please request a regrade. You should preferably attend the office hours of the person who graded the question (shown below), or as a second choice email the staff list. Arithmetic errors should go to the staff list or the office hours of any of the staff.

The points for questions 1, 2, and 4 were mismarked on the cover sheet of each exam. We corrected these during grading.

| Questions | Grader |
| --------- | ------ |
| 1, 2, 3   | Ben    |
| 4, 5, 6   | Kevin  |

## Problem 1

### Question

Condition Variables:

(a) (3 points) To be semantically correct, the `wait`() operation on a condition variable must execute three operations. Describe, in 10 words or less, each of these operations in their order of execution.

(b) (3 points) Which of the above operations must execute atomically? Describe a race condition that would result if these steps were interrupted.

(c) (5 points) The value of a semaphore records a history of how many times P and V operations have completed, but a condition variable does not have history. How, then, does the user of a monitor avoid missing an important event?

#### (a)

1. Release the monitor lock.
2. Block until signaled.
3. Reacquire monitor lock.

Many students reversed steps 1 and 3. This is a serious misconception, because code executing in a monitor always holds the monitor lock except while it is waiting in `wait`(). This cost 2 points.

Some students added more detail as extra steps, such as adding this thread to a queue of waiting threads. This didn't cost points in itself, but most of these students made other mistakes as well.

#### (b)

Steps 1 and 2 must execute atomically. Otherwise, if the scheduler intervened, another thread could acquire the monitor lock and send a signal (that would be lost) before the thread could start blocking.

Some students stated that individual steps must be atomic. This is true, but not interesting: of course operations such as releasing a lock must happen atomically. (Students who broke step 2 into "add thread to queue" and "block until signaled" quite reasonably stated that those two operations must be atomic.)

A few students thought that steps 2 and 3 must execute atomically. I can see how Hoare semantics (though rarely implemented) would require receiving a signal to be atomic with reacquiring the monitor lock, so this was worth 2 points by itself.

#### (c)

A monitor has a monitor lock that must be held by any code running in the monitor. This ensures that only a single thread can be running in the monitor at a time. Thus, code that needs to wait for a condition can be sure that the condition's status cannot change, and the event cannot be signaled, while it is checking for it. Because of the atomicity of steps 1 and 2 in part (a), above, it can also be certain that the event cannot be signaled between the time it releases the lock and the time it starts blocking.

Some students stated that the monitor could include a counter to track history. This is accurate, but it misses the crux of the question, which is about missing an event, not counting. Depending on how well it was stated, this answer was worth up to 3 points.

Some students cited the loop used for condition variables with Mesa semantics, e.g. `while(!condition) wait(condition);`. This is incorrect, because the loop does not prevent missing events. Rather, it makes sure that the condition that was signaled is still true (not stolen by a thread scheduled in between unblocking and waking up).

## Problem 2

### Question

The DeathStation 8000 is a uniprocessor machine designed from the ground up to support cooperative multitasking under DeathOS, a multiprocessing, multithreaded OS. The DeathStation 9000, now in development, will support preemptive multitasking. This obviously requires changes in DeathOS. Answer the following questions about other changes that are also necessary.

(a) (5 points) What changes might the upgrade require in application software, to keep it from malfunctioning in this new environment? Consider how the assumptions that processes and threads may make change between cooperative and preemptive multitasking environments.

(b) (4 points) What additional hardware, or modifications to hardware, might the upgrade require?

#### (a)

Threads and processes in a cooperative multitasking environment have little need for explicit synchronization. Every sequence of code between calls to `yield`() (or other functions that block) is implicitly a critical section. If preemptive multitasking is introduced, applications must be modified to use explicit synchronization.

This question is explicitly about applications. Some students instead described ways that the OS would have to be modified. This was worth a few points at most.

#### (b)

A system with preemptive multitasking needs a periodic timer interrupt to enforce time slicing.

Some students suggested that a cooperative multitasking environment has no need for hardware interrupts at all. This is not generally true—hardware interrupts are used for much more than time slicing.

The DS9000 is still a uniprocessor machine, so atomic compare & swap instructions, etc., are not needed for synchronization, but some students thought so.

## Problem 3

### Question

Each of the events listed below interrupts the flow of execution. Answer the following questions about each event:

• Is the event synchronous or asynchronous?
• What state must be saved and restored?
• Who (caller, callee, etc.) must save and restore this state?

Your answer should discuss generic components of machines (e.g. the instruction pointer), instead of being oriented toward any particular machine architecture (e.g. EBX in the 80x86 SVR4 ABI).

(a) (5 points) Procedure call

(b) (5 points) System call

(c) (5 points) Process switch caused by a time slice expiring

#### (a)

Procedure calls are synchronous. The caller saves the instruction pointer and the callee restores it (possibly with hardware assistance). The caller and the callee cooperate to save and restore the stack pointer, and the caller and the callee must agree on who saves other registers, but details vary.

#### (b)

The transition between user and kernel privilege level needs hardware assistance, and there may be an address space change, but otherwise this is the same as (a).

#### (c)

Timer interrupts are asynchronous. The instruction pointer, stack pointer, and other registers, as well as any other process-visible state such as the current address space, must be saved by the kernel on behalf of the old process/thread and restored by the kernel on behalf of the new process/thread.

There is no caller or callee here, just old and new processes or threads. I tried to guess what students meant by these terms but might not have always succeeded.

A few students said that the interrupt level needed to be saved. That's not really true, because interrupts must have been enabled for the timer interrupt to be delivered at all.

## Problem 4

### Question

(6 points) Some coding standards for applications require that every block obtained by a memory allocation function (e.g. `malloc`()) must be released with a corresponding deallocation function (e.g. `free`()) before the process terminates. Explain why this practice may cause additional work for the OS's virtual memory subsystem and thus slow down the system.

The major source of additional work is if the OS's virtual memory system must page in (from disk) pages that have not been used in a long time. The pages would be brought in only to have the memory on that page released, a wasteful process. Only this answer received full credit.

Some answers suggested other inefficiencies of releasing memory, such as the need to modify page tables or other kernel data structures. While true, these are not a significant source of overhead when compared to bringing pages in from swap, so these answers received partial credit.

Many exams pointed to bookkeeping work performed by `free`(), e.g. the maintenance of free lists and the coalescing of blocks. `malloc`() and `free`() are part of the C library, and NOT system calls (though internally they use the `sbrk`() syscall discussed in lecture). Because the question asked for work the OS performs, these answers received minimal credit. Fragmentation is only a problem if the process frees memory and then allocates more memory; since the question asks about terminating processes, fragmentation is not relevant.

An operating system naturally reclaims a process's virtual memory pages when the process terminates; a few answers said that not freeing would cause a memory leak, which is incorrect.

## Problem 5

### Question

Ben's allocator uses a linked list to track free blocks. In the allocator, all blocks are a multiple of 4 bytes in length. Answer the following questions assuming that the free list contains just two blocks, of 28 and 16 bytes each, in that order. Ignore any space required by bookkeeping overhead.

(a) (3 points) List a sequence of `malloc`() calls that would succeed given first-fit allocation, but fail with best fit.

(b) (3 points) List a sequence that would succeed with best fit but fail with first fit.

(c) (3 points) List a sequence that would succeed with worst fit but fail with best fit.

Sample patterns that work:

#### (a)

`malloc(14)` `malloc(14)` `malloc(16)`

#### (b)

`malloc(14)` `malloc(20)`

#### (c)

`malloc(14)` `malloc(16)` `malloc(14)`

The answers are, of course, not unique. Explanations were not necessary. Any sequence that succeeds and fails as required received the full 3 points. Sequences that did not succeed and fail as required received no credit.

## Problem 6

### Question

Segmentation & Paging:

(a) (5 points) The IBM 370 and Intel 80x86 architectures implement segmentation and paging at the same time, in very different ways, but both architectures perform segmentation before paging. Would it ever make sense to perform paging before segmentation? If so, outline a hypothetical MMU architecture of that form and the rationale behind it. If not, explain why not.

(b) (5 points) Most architectures designed recently do not support segmentation. Can segmentation be simulated with paging? Describe how so. Specify your assumptions.

#### (a)

In general, segmentation after paging does not work. The major benefit of paging is the elimination of memory fragmentation (because all pages are the same size). Adding segmentation after paging reintroduces the fragmentation problem by requiring the physical pages (after address translation) to be contiguous. Therefore, answers which addressed the fragmentation problem received full credit. Some answers did not quite make all the connections; these received partial credit.

Observing that adding segmentation introduces another layer of translation and thus affects performance is incorrect. If a hardware architecture can perform segmentation then paging, it can perform paging then segmentation just as cheaply.

I accepted novel ideas for credit, such as using the segmentation layer for protection instead of address translation. Any novel idea had to be both useful and practical to receive credit.

#### (b) (general discussion)

The most direct way to simulate segmentation is to treat the higher-order bits of an address as a segment number. The page table is populated with pages starting at the base address determined by those higher-order bits (e.g. `0x12340000`) and filled with enough pages to cover the segment's bounds.

In general, it is not possible to fully simulate segmentation with paging, because paging can only enforce bounds that are exact multiples of the page size.

It is possible to modify a page fault handler to make arbitrary bounds on a given page (at every memory access, the handler checks the bounds in software). This approach allows full segmentation, but the worst case performance is a page fault on every instruction and so it is generally not used.