G22.3250 Final project
Overview
In this lab you will define and execute your own project. The
lab is structured in several parts:
- Project proposal. The proposal is a short (maximum of two
pages) description of what your project will be. It should state
what problem you are solving, why you are solving it, what
software you will write, and what the expected results will
be. You won't be judged on your proposal; it is there to help you
to get started.
- Software demo. You will execute the project and
demonstrate the software to me.
- Presentation. You will present your work to the class in a
talk on the last day of classes.
- Project paper. The paper is a maximum of 12 pages in at
least 11 point font, describing the problem you solved and the
solution, and reporting how effective that solution was. The
paper should have a serious evaluation section, similar to the
ones that you have seen in the research papers you are reading in
class.
Your grade will primarily be based on the paper.
Doing a good project is a daunting task. In general, it is better to
tackle a precise small problem and do a good job evaluating it than to
tackle a large problem and get lost in the scope of the problem. To
help you to define a project, this page lists several suggestions.
You should keep me informed at all stages of your project. Please
come talk to me about your idea for a project, how you should execute
the project, what resources you need for the project, what you should
write about in your final paper, etc.
The project should be executed in teams of 2-4 students. The larger
the team, the more ambitious your project should be. When you have
formed a team, send me email to set up a meeting and discuss your
proposed project.
Project suggestions
If you are having trouble thinking of a project idea, some of the
ideas below might help get you started. You might also want to look
through recent papers in SOSP,
the top OS conference, as well as OSDI and Usenix (both of which are
available as links off of this
page). Here are some other ideas:
- Design a disk layout for a file system and implement it as a
network file server (the SFS protocol is very much like NFS).
Make sure your layout and update algorithms have good crash
recovery properties.
- Design a peer-to-peer backup system. A number of peer-to-peer
distributed hash tables, such as Chord, can scale to
millions of separately administered nodes and survive high failure
rates. By backing up data to such a system, one can achieve
off-site backup by exchanging storage with other sites.
- Currently, the only way to disseminate streaming media to many
users is to pay for lots of bandwidth. A more democratic
alternative would be for users interested in the data (for instance,
streaming audio) to donate bandwidth to help disseminate the file
further. Design a self-organizing system that can stream audio to
millions of nodes from behind a relatively low-bandwidth network
such as a cable modem. The source and each node in the system can
transmit the data to two other nodes, thus reaching an exponential
number of nodes in total.
- Implement the world's fastest web proxy with persistent caching.
Use NFS as an asynchronous interface to the file system.
- Implement a threaded web proxy and analyze the performance.
Compare threaded vs. asynchronous performance.
- Implement a system like Network
Objects for C++.
- Build a mail proxy that rejects spam. It's easy to get spam by
posting to newsgroups. Post fake articles from fake email
addresses, and record any mail sent to those addresses as spam.
Then reject any mail messages almost identical to the spam.
- Make a distributed shared memory (DSM) system, so that processes
running on different machines can share an address space. Plan to
allow caching but maintain consistency. You would also want to
find at least one program that could take good advantage of DSM,
to help you evaluate your system.
- Implement a system for dynamic caching of data on untrusted
machines. The SFSRO read-only file system digitally signs data so
clients needn't trust the server. This would allow clients to get
data from each other without trusting each other. However a
mechanism is required for clients to find out about and transfer
data from each other.
- Build a gateway between a file system and something else (e.g. FTP,
a database, the web).
- The Eraser tool finds data races in threaded programs. It uses
link-time code modification to rewrite executables, and thus is
not portable. Implement a portable race detector using C++ templates
(ptr, ref, etc.) to detect when memory is referenced.
- Propose a new buffer management strategy and instrument a file
system to evaluate it.
- Design and evaluate a scheme for dynamically organizing data on
disk to optimize reads. For example, many web pages contain
several images. Disk throughput would be improved if a web page
and all its images were near each other on disk. This could be
achieved by reorganizing disk layout based on reads (particularly
if the system has more disks than it needs for storage). Design
such a disk-reorganization scheme and evaluate it using traces
from a web server.
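As a rough illustration of the fan-out argument in the streaming
suggestion above: if the source and every node each forward the stream
to two other nodes, the nodes form a binary tree, and the number of
listeners within d hops of the source is 2 + 4 + ... + 2^d. The Python
sketch below computes how many hops are needed to reach a given
audience. The function names and the assumption of a perfectly
balanced tree with no node failures are illustrative only and are not
part of the assignment.

```python
def nodes_reached(depth: int, fanout: int = 2) -> int:
    """Listeners reached within `depth` hops of the source (source excluded).

    Assumes an idealized, perfectly balanced tree in which the source and
    every node forward the stream to exactly `fanout` other nodes.
    """
    return sum(fanout ** level for level in range(1, depth + 1))


def hops_needed(listeners: int, fanout: int = 2) -> int:
    """Smallest tree depth whose levels can hold `listeners` nodes."""
    depth = 0
    while nodes_reached(depth, fanout) < listeners:
        depth += 1
    return depth


if __name__ == "__main__":
    for audience in (1_000, 1_000_000):
        print(audience, "listeners need", hops_needed(audience), "hops")
```

Under these idealized assumptions, a million listeners fit within 19
hops of the source. A real design would therefore probably spend more
effort on per-hop delay and on repairing the tree when nodes leave
than on bandwidth at the source.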
The Paper
This section provides some suggestions and guidelines on writing style
and some of the things the grade will be based on.
Suggestions on Writing Style
Your paper should be as long as is necessary to explain the problem,
your solution, the reasons for your choices, and your analysis of your
solution. It should be no longer than that. The body of your paper
must not exceed twelve 11-point, single-spaced pages in
length. Please use 1-inch margins. In general, your paper's style and
arrangement should be similar to the papers we've read in class.
A good paper begins with an abstract. The abstract is a very short
summary of the entire paper. It is not an outline of the
organization of the paper! It states the problem to be addressed (in
one sentence). It states the essential points of your solution,
without any detailed justification. And it announces any conclusions
you have drawn. Good abstracts can fit in 100-150 words, at most.
The body of your paper should expand the points made in the
abstract. Here you should:
- Introduce the problem and the externally imposed constraints.
- State the goals of your solution clearly.
- Describe the design of your solution.
You may wish to divide the description into a high level
architecture and a set of lower-level implementation decisions.
This would be a good place for pictures and diagrams.
- Analyze how well the system you built fulfils your goals.
Depending on your system, the analysis might deal with
performance in the sense of throughput or running time;
but keep in mind that factors such as reliability and
usability may be as important as, or more important than,
performance for some systems.
- Briefly review related work in the area of your project.
The goal is to show either how you extended existing work
or how you improved on it.
- Conclude with a review of lessons to be learned from your work.
- Document your sources, giving a list of all references (including
personal communications). The style of your citations
(references) and bibliography should be similar to the styles in
the technical papers you're reading in this class. In
particular, a bibliography at the end and no citations in the
text of your paper is insufficient; you should show what
specific pieces of information you learned from where.
Write for an audience that understands basic OS and network concepts
and has a fair amount of experience applying them in various
situations, but has not thought carefully about the particular problem
you are dealing with.
How will your paper be graded?
Your paper will be graded on both content and writing.
Some content considerations:
- Do you provide motivation for why the problem you chose is
worthwhile or interesting?
- Does your solution address the goals you stated?
- Do you explain your decisions and the trade-offs?
- How complex is your solution? Simple is better, yet sometimes simple won't
do the job. But unnecessary complexity is bad.
- Does your solution fit well with the rest of the system? If your solution
requires modifying every piece of hardware, software, and data in sight,
it won't be credible, unless you can come up with a very good story why
everything needs to be changed.
- Is your analysis clear?
Some writing considerations:
- Is the report easy to comprehend?
- Is it well organized and coherent?
- Does it use diagrams where appropriate? (A frequent problem when people
use word processors is that they try to express everything in words, either
because the word processor doesn't make it easy to include diagrams, or
they haven't ever learned how to use the drawing features. Pictures can
communicate some ideas far better.)
- Does it use the concepts, models, and terminology used in the course?
If not, does it have a good reason for using a different universe of discourse?
- Is there a good abstract and bibliography?
Make sure you save enough time to write a good paper, since that's
what will determine your grade!