Distributed Systems Lab 1: Introduction to RPC

This lab will introduce you to programming with RPC, which is commonly used in distributed systems.

Labs in this course will assume that you have access to a Linux machine on which to compile and run the code, and course staff will also be using a Linux machine to build and grade your code. If you don't already have access to a Linux machine, you can use the cardinal.stanford.edu machines provided by ITSS. Refer to this ITSS guide for instructions on how to log into these machines remotely.

The code provided for this lab is largely written in C++, and uses the C++ Standard Template Library (STL). The reference materials page includes links to helpful references for the C++ language itself and for the STL.

Part 1: Downloading and building the lab code

To start with, you should download the initial code for this lab, which may be found here:

http://www.scs.stanford.edu/07wi-cs244b/lab1.tar.gz

Before continuing with the lab, make sure you can build the code and run the resulting executables, as follows:

% wget http://www.scs.stanford.edu/07wi-cs244b/lab1.tar.gz
...
% tar zxf lab1.tar.gz
% cd lab1
% make
rpcgen -M cf.x
rpcgen -M -m cf.x > cf_svc.c
cc -g   -c -o cf_clnt.o cf_clnt.c
cc -g   -c -o cf_xdr.o cf_xdr.c
g++ -g   -c -o cfc.o cfc.cc
g++ -o cfc cf_clnt.o cf_xdr.o cfc.o -lpthread
cc -g   -c -o cf_svc.o cf_svc.c
g++ -g   -c -o cfd.o cfd.cc
g++ -g   -c -o cfd_ops.o cfd_ops.cc
g++ -o cfd cf_svc.o cf_xdr.o cfd.o cfd_ops.o -lpthread
% ./cfd
Usage: ./cfd port-number
% ./cfc
Usage: ./cfc server port command ...
Commands:
  null
  read pathname
  write pathname data
  mkdir pathname
  mkfile pathname
  rm pathname
  ls pathname
%

On the ITSS cardinal.stanford.edu machines, you may get an error message about a missing libstdc++.so.6. To fix it, run setenv LD_LIBRARY_PATH /usr/pubsw/lib if you are using csh or tcsh (or if using bash, run export LD_LIBRARY_PATH=/usr/pubsw/lib). You may also want to add this command to your ~/.cshrc or ~/.bash_profile file so that you don't need to run it every time you log in.

Part 2: Anatomy of an RPC application

Now that you have successfully set up your build environment, let's look at the actual application in more detail. This lab will be based around a simple distributed file store called cf (which stands for class file-store). The code provided to you consists of three main parts:

The client, called cfc (which stands for cf client)
The server, called cfd (which stands for cf daemon)
The definition of the RPC interface used by the client and server to communicate, in cf.x.

Try to run the code we have provided; some of the functionality is missing (such as the ls command, which you will be adding later in this lab), but you should be able to create, read, and write files. To start the cfd file server, give it a TCP port number that it should listen on as the first argument. Providing a port number of zero will cause it to choose an arbitrary available port number. You should then be able to use the cfc client to talk to your file server and create files, as follows:

% ./cfd 0 &
[1] 536
Listening on port 51730
% ./cfc localhost 51730 mkfile /hello
% ./cfc localhost 51730 write /hello world
% ./cfc localhost 51730 read /hello
world
% ./cfc localhost 51730 read /world
Server reports error: No such pathname (1)
%

If you can't get the provided code to work as above, something may be wrong, and you should contact the course staff.

Now let's look at the code. First, read through the interface definition in the file cf.x. The file starts out by defining some data types that will be used by cf, such as error codes, pathnames, and so on. Then, using these data types, the file declares the CFS_PROG RPC program, and specifies the operations that it supports.

Exercise 1. Read through the interface definition provided in the file cf.x. What are some of the inherent limitations of this interface as it's defined? Would it be possible to implement a traditional file system on top of it? What operations are missing?

Place your answers in a file called answers.txt in the lab1/ directory; an empty answers.txt file should already exist there.

As you may recall from lecture, the interface definition is translated into real code by a special RPC interface compiler. For this lab, we will use the rpcgen compiler. Given the cf.x input file, rpcgen produces the following output files:

cf.h contains the C data types and RPC stub prototypes for our interface. Look through this file to understand how the interface we have defined corresponds to real C data structures and functions that we will be invoking. How does rpcgen choose to represent the union cfs_readdir_res data type we have declared?
cf_xdr.c contains code to marshal and unmarshal our data structures to and from the network. The system provides routines to handle some basic data types, such as integer values, strings (xdr_string), and arrays (xdr_array), which are then used by rpcgen to marshal and unmarshal more complicated user-defined data structures. How does rpcgen handle the union cfs_readdir_res structure?
cf_clnt.c contains client-side stubs for each RPC function that our interface defines. For instance, our interface defined a CFS_READ operation, and the client-side stub for this operation is called cfs_read_1(). All of these stubs use a system-provided function called clnt_call, and pass to it the input arguments, as well as pointers to xdr_ functions for marshaling and unmarshaling the input and output arguments. The clnt_call function will marshal the input arguments, send them over to the RPC server specified by the CLIENT *clnt argument, and receive and unmarshal the response.
Finally, cf_svc.c contains the dispatcher code for server-side functions. Its job is to take a request, unmarshal it, and invoke the appropriate function for executing the RPC request. For instance, for the same CFS_READ operation, the dispatcher code will invoke cfs_read_1_svc(). When the function returns, the dispatcher will marshal any data returned by the function and send it back to the client.
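
The division of labor between a client stub and the server-side dispatcher can be illustrated with a small in-process sketch. This is not the code rpcgen generates, and it does not use the real clnt_call or xdr_ routines. Every name in it (put_u32, get_string, toy_dispatch, toy_strlen_call, and so on) is hypothetical, and a direct function call stands in for the network. The one detail borrowed from XDR is the wire layout: 32-bit big-endian integers, and strings sent as a length followed by the bytes padded to a multiple of four.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::vector<unsigned char> buffer;

// Append a 32-bit unsigned integer in big-endian byte order.
void put_u32(buffer &out, uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<unsigned char>((v >> shift) & 0xff));
}

// Read a big-endian 32-bit integer at pos and advance pos past it.
uint32_t get_u32(const buffer &in, size_t &pos) {
    if (pos > in.size() || in.size() - pos < 4)
        throw std::runtime_error("truncated message");
    uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v = (v << 8) | in[pos++];
    return v;
}

// Marshal a string: length, then the bytes, zero-padded to 4 bytes.
void put_string(buffer &out, const std::string &s) {
    put_u32(out, static_cast<uint32_t>(s.size()));
    out.insert(out.end(), s.begin(), s.end());
    for (size_t n = s.size(); n % 4 != 0; ++n)
        out.push_back(0);
}

// Unmarshal a string written by put_string and skip its padding.
std::string get_string(const buffer &in, size_t &pos) {
    const size_t len = get_u32(in, pos);
    const size_t padded = (len + 3) / 4 * 4;
    if (in.size() - pos < padded)
        throw std::runtime_error("truncated string");
    std::string s(in.begin() + pos, in.begin() + pos + len);
    pos += padded;
    return s;
}

// Hypothetical procedure numbers for a toy interface.
enum toy_proc { TOY_NULL = 0, TOY_STRLEN = 1 };

// Server-side implementation of TOY_STRLEN (plays the role of a *_svc function).
uint32_t toy_strlen_svc(const std::string &arg) {
    return static_cast<uint32_t>(arg.size());
}

// Dispatcher: unmarshal the argument, call the handler, marshal the result.
buffer toy_dispatch(uint32_t proc, const buffer &request) {
    buffer reply;
    size_t pos = 0;
    switch (proc) {
    case TOY_NULL:
        break;
    case TOY_STRLEN:
        put_u32(reply, toy_strlen_svc(get_string(request, pos)));
        break;
    default:
        throw std::runtime_error("unknown procedure");
    }
    return reply;
}

// Client stub: marshal the argument, "send" it, and unmarshal the reply.
uint32_t toy_strlen_call(const std::string &s) {
    buffer request;
    put_string(request, s);
    const buffer reply = toy_dispatch(TOY_STRLEN, request);  // stands in for the network
    size_t pos = 0;
    const uint32_t result = get_u32(reply, pos);
    assert(pos == reply.size());  // the reply holds exactly one integer
    return result;
}
```

Calling toy_strlen_call("hello") takes the same path a real stub would: arguments become bytes, the dispatcher selects a handler by procedure number, and the result comes back as bytes that the stub decodes.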

Now let's turn to the code that actually uses these interface stubs. The cfd server implements an in-memory file store and provides access to it through the cf RPC interface. The server consists of three source files:

cfd_ops.cc implements each of the RPC functions we defined as part of our interface in cf.x. These functions are invoked by the dispatcher in cf_svc.c that was generated by rpcgen.
cfd_fs.hh implements a simple in-memory file store with directories and files, which is used by the cfd_ops.cc code to actually execute the file store requests coming from the clients. Each node in the file store tree is represented by a cfd_fs_node object, which can be either a directory or a file. A virtual bool is_a_directory() method allows you to check whether any given file store node is a directory or a file.
Finally, cfd.cc provides the main() function for the file server, which binds to a TCP socket, listens for incoming connections, and runs the dispatcher (from cf_svc.c) when a client connects and sends a request.
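
To make the idea of a node that "can be either a directory or a file" concrete, here is a minimal sketch of that kind of class hierarchy. It is not the contents of cfd_fs.hh: the names toy_node, toy_file, toy_dir, and describe are hypothetical, and the const qualifier on is_a_directory() is an assumption. Read the real header to see how cfd_fs_node is actually laid out.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical base class: every node can say whether it is a directory.
struct toy_node {
    virtual ~toy_node() {}
    virtual bool is_a_directory() const = 0;
};

// A file holds its contents in memory.
struct toy_file : toy_node {
    std::string data;
    bool is_a_directory() const override { return false; }
};

// A directory maps names to child nodes, which may be files or directories.
struct toy_dir : toy_node {
    std::map<std::string, std::unique_ptr<toy_node>> children;
    bool is_a_directory() const override { return true; }
};

// Use the virtual query to decide which concrete type we are looking at.
std::string describe(const toy_node &n) {
    if (n.is_a_directory()) {
        const toy_dir &d = static_cast<const toy_dir &>(n);
        return "directory with " + std::to_string(d.children.size()) + " entries";
    }
    const toy_file &f = static_cast<const toy_file &>(n);
    return "file of " + std::to_string(f.data.size()) + " bytes";
}
```

The static_cast is only safe because is_a_directory() has already been checked, which is the usual reason for exposing such a query on the base class.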

There are also a few helpful abstractions we have included that hopefully simplify programming:

scopeguard.hh provides a C++ scope_guard template object which executes a predefined function in its destructor. This is useful if you want to execute a cleanup function, such as free(ptr), whenever the control flow leaves a particular scope (either because of a return statement or because an exception was thrown). The advantage of using scope_guard is that you don't have to worry about putting the free(ptr) statement before every return. For example, in the following code, it makes sure that buf is always deallocated:
```
int do_something(...) {
    char *buf = (char *) malloc(1024);  /* C++ needs an explicit cast from void* */
    scope_guard<void, void*> cleanup(free, buf);

    /* ... */
    if (error)
        return -1;

    /* ... */
    return 0;
}
```
scopedlock.hh provides a scoped_pthread_lock object, which locks a specified mutex in its constructor, and releases it in the destructor. Similar to scope_guard, this ensures that your code always releases the mutex, regardless of whether your function uses a return statement or throws an exception. Look at cfd_ops.cc for an example of how the scoped_pthread_lock object is used.
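
Both helpers rely on the same C++ idiom: do the cleanup in a destructor, because destructors run on every path out of a scope. The following sketch shows the idiom in isolation. It is not the code in scopeguard.hh or scopedlock.hh. The names toy_guard and do_work are hypothetical, and the sketch uses C++11 features (std::function and lambdas) that the provided headers may not use.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>

// Hypothetical guard: runs the stored cleanup function when destroyed.
class toy_guard {
public:
    explicit toy_guard(std::function<void()> f) : cleanup_(std::move(f)) {}
    ~toy_guard() { cleanup_(); }

    // Copying would run the cleanup twice, so forbid it.
    toy_guard(const toy_guard &) = delete;
    toy_guard &operator=(const toy_guard &) = delete;

private:
    std::function<void()> cleanup_;
};

// Leaves its scope in three different ways; the guard fires on each.
// mode 0: normal return, mode 1: early return, mode 2: exception.
int do_work(int mode, int &cleanups) {
    toy_guard g([&cleanups] { ++cleanups; });

    if (mode == 1)
        return -1;
    if (mode == 2)
        throw std::runtime_error("failed");
    return 0;
}
```

Replacing the counter increment with a call that releases a mutex gives the shape of a scoped lock: acquire in the constructor, release in the destructor.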

Exercise 2. Read through the provided server code. Why do all of the RPC functions in cfd_ops.cc hold a mutex for the duration of their execution? Will this mutex be held while the server is receiving data from or sending data to a slow client, thereby preventing other clients from making any progress? Place your answers in the answers.txt file.

The client is implemented by code in cfc.cc. Based on the command-line arguments, the client connects to the server, invokes the appropriate RPC client stub with the right arguments, and prints the response (if any) back to the user.

Exercise 3. Read through the provided client and server code. When you run cfc with a read command to read a non-existent file, an error message is printed, indicating that there is no such file. Where is this error generated in the cfd server, and how does it flow from there in the server process to the client and to the printf statement which prints it to your screen? Again, place your answers in the answers.txt file.

Part 3: Implementing new functionality

You may have noticed that some functionality, such as the CFS_READDIR interface function, corresponding to the cfc ... ls command, is not fully implemented. In particular, the code in the cfs_readdir_1_svc() function in cfd_ops.cc is largely missing. It will be your job to fill in the missing code to make cfc ls work. The in-memory file store code in cfd_fs.hh already provides a cfd_fs_dir::readdir() method which returns an STL std::vector of names in that directory. Use the STL reference from the reference materials page to refresh your memory on how to work with STL's std::vector.
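
As a quick refresher on std::vector, the sketch below walks a vector of names with an iterator and builds a printable listing. The function name format_listing is hypothetical and is not part of the lab code. It only shows the container operations you are likely to need (begin(), end(), dereferencing an iterator) and says nothing about how the result should be packed into the RPC reply.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Return the names one per line, in sorted order.
// The vector is taken by value so the caller's copy is left untouched.
std::string format_listing(std::vector<std::string> names) {
    std::sort(names.begin(), names.end());

    std::string out;
    for (std::vector<std::string>::const_iterator it = names.begin();
         it != names.end(); ++it) {
        out += *it;   // *it is the std::string at the current position
        out += '\n';
    }
    return out;
}
```

An empty vector produces an empty string, since the loop body never runs when begin() equals end().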

We have provided a test script called readdir-test.sh for you, which will run some simple tests on your file server to see if it appears to implement the CFS_READDIR operation reasonably well. If the test script fails when run against your file server, you can see what operations it's issuing by running it as sh -x ./readdir-test.sh host port.

Exercise 4. Fill in the code for cfs_readdir_1_svc() to implement the CFS_READDIR operation. Test your code using cfc ... ls and the provided readdir-test.sh script to make sure your code works.

Now that your ls command works, you may also want to learn more information about the files shown to you by the ls command, much like what the ls -l command shows on Unix.

Exercise 5. Implement an "ls -l" operation which reports not only the names of the entries in a directory, but also whether each name corresponds to a file or a directory and, for files, the size.

In doing this exercise, you will need to augment the interface in cf.x to transmit additional information over the network. What are the different ways you can extend the interface to accomplish this goal? For example, you may be able to change some existing calls, such as readdir, or you may introduce new calls to fetch the type and size for a pathname. What are the advantages and disadvantages of the different approaches you can think of? Put your answers to this question in the answers.txt file.

To complete this exercise, you will also need to modify both the server, to support your modified RPC interface, and the client, to provide a new "ls -l" command. At the end of this exercise, your cfc client should be able to list the contents of a directory at the server, and print the type of each entry (file or directory) and the size of each file.

Challenge! The current interface reads and writes an entire file in a single operation, making it prohibitively expensive to operate on large files. Extend the interface to support reading and writing ranges of a file, and implement the corresponding server-side and client-side code. This challenge is an optional part of the lab.

Part 4: Turn in your lab

To turn in your answers for this lab, you must package up your code and answers.txt and submit it to cs244b-staff@scs.stanford.edu by the turn-in deadline. You can package up your code and answers.txt using make submit.tar.gz, and either send the resulting submit.tar.gz file to us as an attachment, or use make turnin to automatically mail it to us:

% make submit.tar.gz
tar -zcf submit.tar.gz Makefile *.[chx] *.{cc,hh} answers.txt
% make turnin
tar -zcf submit.tar.gz Makefile *.[chx] *.{cc,hh} answers.txt
uuencode submit.tar.gz submit.tar.gz | mail cs244b-staff@scs.stanford.edu,username@stanford.edu
%

Make sure that you receive the copy of your submission sent to your email address (username@stanford.edu) if you use make turnin.