The final lab assignment for the class is to undertake a mini research
project of your choice in a group of 1-3 people.
Your project should be guided by the following deadlines:
- By Friday, February 17 you should form a
project team of 1-3 people (preferably 2) and email the course
instructor to let me know with whom you will be working.
- Before Wednesday, February 22 you should
schedule a meeting of your team with me (David Mazières) to
discuss your proposed project. Once I have approved your proposal,
send me a short (1 paragraph) description of what you want to do,
which I might post on the web site so everybody knows what the
different projects are.
Please schedule your meeting sooner rather than later, in case your
team needs to iterate on the proposal. Note, in particular, that it's
fine to meet with me if you don't have a concrete plan yet, as I can
suggest some things for you to look into.
Note: If you want to combine your project with your research or with
work for another class, this is in general fine, but please let me
know at the time of the proposal.
- By March 22 you must email me a paper
describing and evaluating your project. The paper should be no more
than 10 pages in at least 11-point font. I may post the papers on the
class web site.
- At 12:15pm Thursday, March 23, you will
present your project to the class and demo what you have done. Your
combined talk and demo should take no more than 20 minutes. You must
also submit the source code to your project at this time. Note that
this is a hard deadline, since I have to reserve a room for the
presentations and submit grades for people.
Here are some ideas you might be interested in for projects. This
list is by no means exhaustive.
Note that if you want to do kernel-level work (for instance, for a
performance evaluation), we have a limited amount of lab space and
equipment you can use.
- Implement a distributed file system in which the server supports
volume migration with copy-on-write snapshots, like AFS.
- Build a replicated storage server.
- Make some modifications to the NFS3 protocol to support better
performance over high-latency, wide-area networks and scalability to
many clients. Implement the modified protocol in an open-source
operating system kernel, or at user-level using the xfs device driver
from the ARLA
- Build a distributed file system layered on top of a simple block store
abstraction (like Frangipani on Petal). Feel free to look at other
classes' labs for some ideas on how to do this.
- Modify OpenAFS to add features
like security and maybe automounting self-certifying pathnames.
- Build a distributed file system in which clients on a LAN cooperate to
share their buffer caches, since network accesses are much faster than
- Add support for integrity checking or group-readable files to CCFS.
- Add support for compressing plaintext files before encrypting them in
- Design and implement a new buffer cache management scheme. Evaluate
its performance on various workloads (e.g., large databases),
compared to regular Linux or BSD.
- Implement a file system that periodically (or consistently) snapshots
users' files, so they can recover from accidental deletion.
- Design and implement a kernel-level file system that takes advantage
of rarely used hardware functionality (maybe the ability to have
non-standard sector sizes).
- Build a file-system front-end to another data access system, like the
web, or a version control system.
- Build a cooperative backup system (see, for instance,
- Build an append-only (or append-mostly) file system that elides
redundant data to make it reasonable to dump backups onto optical
media like DVD+R on a daily basis. See
and Fossil for some ideas on this.
- Build or modify an existing file system to make it re-organize data
placement to optimize performance. For instance, if a particular HTML
file is always accessed in conjunction with several image files, they
might be located near each other to reduce latency. Data that often
causes buffer cache misses might even be replicated on several parts
of the disk.
- Build a storage infrastructure for accessing massive data sets in
distributed computations over the wide area network.
- Extend an existing file system to make NFS service faster through a
small amount of non-volatile RAM.
- Design and implement a convincing file system benchmark.
- Design and implement a user-level file reconciliation protocol that
merges changes to distributed copies of a directory tree.
- Build a facility for persistent objects in C, C++ or some other
high-level language. Use write-ahead logging to survive crashes.
Optionally, you might extend the system to work across the network
(perhaps using NFS combined with some user-level locking protocol).
- Build a storage system that scales to many readers. For example, you
might want to have clients upload data to each other, as in BitTorrent. Or you might target
media files, like radio or TV shows, in which case users will want to
play the file while downloading, and BitTorrent's random download
order will not work.
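
To make the last idea more concrete, here is a minimal Python sketch of one
possible piece-selection policy for playing a file while it downloads. It is
only an illustration under stated assumptions: the function name `next_piece`,
the `window` parameter, and the policy itself (fetch in order just ahead of
the playback position, pick randomly elsewhere) are hypothetical and are not
part of BitTorrent or of any required design.

```python
import random


def next_piece(have, num_pieces, playhead, window=4, rng=random):
    """Pick the next piece index to request, or None if nothing is missing.

    have       -- set of piece indices already downloaded
    num_pieces -- total number of pieces in the file
    playhead   -- index of the piece the player needs next (hypothetical input)
    window     -- how many pieces from the playhead onward to fetch in order
    """
    # Pieces the player will need soon are requested strictly in order,
    # so playback does not stall waiting for them.
    for i in range(playhead, min(playhead + window, num_pieces)):
        if i not in have:
            return i
    # Everything in the window is present: fall back to a random missing
    # piece, which keeps different clients holding different data to trade.
    missing = [i for i in range(num_pieces) if i not in have]
    return rng.choice(missing) if missing else None
```

A project along these lines would still have to decide how large the window
should be and how to keep peers useful to one another when most of them want
the same early pieces.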