In this lab, you will build a simple encryption/decryption utility. The utility will allow you to generate private key/public key pairs for an Adaptive Chosen-Ciphertext (CCA) secure Public-Key Encryption scheme. You can then exchange your public key with your friends, and they will be able to encrypt files under your public key, so that only you can recover the original plaintext.
We provide you with a simple cryptographic library (~class/src/libdcrypt-0.2.tar.gz),
containing an implementation of all the cryptographic tools you will
need for the lab.
By doing the lab, you will gain experience about how to use
Cryptography to build security properties into your system.
Software Setup
The files you will need for this lab are available in ~class/src/lab1.tar.gz. To install them in your account, type the following:
% tar xzf ~class/src/lab1.tar.gz % cd lab1 %
We have set up the appropriate libraries and header files so that
you can compile against libdcrypt on the class machines. If you are instead
working on your own machine, you should install the dcrypt library in your system.
(You will also need to have the
GNU
MP and dmalloc libraries
installed in your machine.)
Lab Specification
Your Encryption/Decryption Utility will consist of three programs:
edu_keygen
, edu_encrypt
and
edu_decrypt
. We provide you with a skeleton source
directory (~class/src/lab1.tar.gz),
containing the following files:
% ls ~class/src/lab1/
Makefile edu_decrypt.c edu_keygen.c
edu.h edu_encrypt.c edu_misc.c
%
Once you have implemented the necessary functions (see below), you will build the three programs
edu_keygen
, edu_encrypt
and
edu_decrypt
using make
:
% make
gcc -g -O2 -ansi -Wall -Wsign-compare -Wchar-subscripts -Werror -I. -I/usr/local
/include/ -c edu_keygen.c edu_misc.c
gcc -g -O2 -ansi -Wall -Wsign-compare -Wchar-subscripts -Werror -o edu_keygen ed
u_keygen.o edu_misc.o /usr/local/lib/libdcrypt.a /usr/local/lib/libgmp.a -L/usr/
local/lib/ -ldmalloc
gcc -g -O2 -ansi -Wall -Wsign-compare -Wchar-subscripts -Werror -I. -I/usr/local
/include/ -c edu_encrypt.c edu_misc.c
gcc -g -O2 -ansi -Wall -Wsign-compare -Wchar-subscripts -Werror -o edu_encrypt e
du_encrypt.o edu_misc.o /usr/local/lib/libdcrypt.a /usr/local/lib/libgmp.a -L/us
r/local/lib/ -ldmalloc
gcc -g -O2 -ansi -Wall -Wsign-compare -Wchar-subscripts -Werror -I. -I/usr/local
/include/ -c edu_decrypt.c edu_misc.c
gcc -g -O2 -ansi -Wall -Wsign-compare -Wchar-subscripts -Werror -o edu_decrypt e
du_decrypt.o edu_misc.o /usr/local/lib/libdcrypt.a /usr/local/lib/libgmp.a -L/us
r/local/lib/ -ldmalloc
%
You should now be able to create your own private key/public key pair, and use them as follows:
% ./edu_keygen my_key.priv my_key.pub % yes "test" | head -1000 > a_file % ./edu_encrypt my_key.pub a_file an_encrypted_file % ./edu_decrypt my_key.priv an_encrypted_file a_decrypted_file % diff a_file a_decrypted_file %
Now you think, "This is cool!," and decide to give a copy of your encryption/decryption utility to your friend (whose name, surprise, surprise, we will assume is Bob). Now that Bob has your program, you can exchange public keys:
% cat my_key.pub | mail bob@bobserver.org
%
Once Bob gets your email ...
bob> You have mail in /var/mail/bob... he will store your public key in a file named, let's say,
alice.pub
. Now Bob can send you highly confidential messages:
bob> echo "The deadline for lab1 is Feb. 22" > msg.plain bob> ./edu_encrypt alice.pub msg.plain msg.cipher bob> uuencode -m msg.cipher bob_msg.cipher | mail alice@aliceserver.org bob>When you get Bob's email:
% ...
You have mail in /var/mail/alice
... you will uudecode it, obtaining a ciphertext file named
bob_msg.cipher
, and then you will use your private key to
unveil Bob's top-secret communication:
% ./edu_decrypt my_key.priv bob_msg.cipher bob_msg.plain % cat bob_msg.plain The deadline for lab1 is Feb. 22
Now that you know how your encryption/decryption utility is supposed to work, let's get into doing something.
To help you get acquainted with libdcrypt, we implemented
edu_keygen
for you: take a look at edu.h
,
edu_misc.c
and edu_keygen.c
,
and make sure you understand what's going on.
We also provide you with an incomplete implementation of
edu_encrypt
and edu_decrypt
in
edu_encrypt.c
and edu_decrypt.c
. Your job is to fill in
the code for the two procedures:
void encrypt_file (const char *ctxt_fname, dckey *pk, int
fin)
(in edu_encrypt.c
)
void decrypt_file (const char *ptxt_fname, dckey *sk, int
fin)
(in edu_decrypt.c
)
encrypt_file
is to read the content from the
file descriptor fin
, encrypt it using pk
,
and place the resulting ciphertext in a file named ctxt_fname
.
The encryption should be CCA-secure (cf. first class), which is the level of cryptographic protection that you should always expect of any implementation of an encryption algorithm.
Below we describe a possible design, based on the so-called "hybrid encryption" paradigm, but you are free to follow a different approach if you want, as long as it is also CCA-secure. The "hybrid encryption" paradigm combines both public-key and symmetric encryption functions: the idea is to:
K
and encrypt it under
the public key pk
; and
K
to encrypt the actual content read from the
fin
file descriptor using e.g., AES.
For this to be secure, the public-key encryption scheme used to encrypt
K
must be CCA-secure (but you don't have to worry about
this; the public-key encryption functions included in the library you
are going to use are all CCA-secure).
The symmetric encryption part must also be CCA-secure: one good approach is to use AES in CBC-mode (described below), and then append an HMAC SHA-1 mac of the resulting ciphertext. Always mac after encrypting!
The dcrypt
library contains implementations of AES
and of HMAC SHA-1 (cf. Useful functions).
However, you should take care of using AES in
CBC-mode, as the library only gives access to the basic AES block
cipher functionality, which is not CCA-secure.
Notice that the key used to compute the HMAC SHA-1 mac must be different
from the one used by AES.
Never use the same cryptographic key for
two different purposes: bad interference could occur.
For this reason, the key K
encrypted under the public key
pk
actually consists of two pieces, one for AES and one
for HMAC SHA-1. The length of both pieces (and hence the cryptographic
strength of the encryption) is specified by the constant
CCA_STRENGTH
in edu.h; the default is 128 bits, or 16 bytes.
Recall that AES can only encrypt blocks of 128 bits, so you should use some padding in the case that the length (in bytes) of the plaintext is not a multiple of 16. This should be done in a way that allow proper decoding after decryption: in particular, the recipient must have a way to know where the padding begins so that it can be chopped off.
One possible design is to add enough 0 bytes to the plaintext so as to make its length a multiple of 16, and then append a byte at the end specifying how many zero-bytes were appended.
Thus, the overall layout of an encrypted file will be:
As for the sizes of the various components of a ciphertext file, notice that:+-----+-----------+--------------------------+---+--------+ | X_l | X | Y | W | padlen | +-----+-----------+--------------------------+---+--------+ where X = PKE (pk, {K_AES, K_HSHA-1}) X_l = length of X in bytes Y = CBC-AES (K_AES, {plaintext, 0^padlen}) W = HMAC-SHA-1 (K_HSHA-1, Y) padlen = no. of zero-bytes added to the plaintext to make its length a multiple of 16
X_l
is two-byte-long,
X
consists of X_l
bytes,
Y
(in bytes) is a multiple of 16,
HSHA-1 (K_HSHA-1, Y)
is 20-byte-long, and
padlen
is a sigle byte.
decrypt_file
is to read the ciphertext from
the file descriptor fin
, decrypt it using
sk
, and place the resulting plaintext in a file named
ptxt_fname
.
This procedure basically should just "undo" the operations performed
by encrypt_file
; for this reason,
decrypt_file
expects a ciphertext featuring the structure
described above.
Notice that reading in X
(i.e., the piece of
ciphertext that encapsulates the symmetric keys used to encrypt and mac
the actual file content) is easy, as its length is prefixed, so we
know exactly how many bytes to read.
Reading Y
(and then the mac and the pad length) is a
bit trickier: below we sketch one possible approach, but you are free
to implement this as you wish.
The idea is based on the fact that the ciphertext file ends with
21 bytes (i.e., the size of a hash + 1) used up by the HSHA-1 mac and by the
pad length. Thus, we will repeatedly attempt to perform "long reads" of
(aes_blocklen
+ sha1_hashsize
+ 2) bytes:
once we get to the end of the ciphertext and only the last chunk of
Y
has to be read, such "long reads" will encounter the
end-of-file, at which point we will know where Y
ends,
and how to finish reading the last bytes of the ciphertext.
Cipher-Block Chaining (CBC) Mode
For encrypting a stream of bytes that does not require random access,
people often employ a technique known as Cipher-Block Chaining
(CBC). To encrypt in CBC mode, one thinks of the stream of bytes as a
sequence of block, each of the size of the block cipher being used
(AES in your case); then, one XORs each plaintext block with the
encryption of the previous block before encrypting, as shown here:
If the plaintext blocks are m1, m2, ..., and the ciphertext blocks c1, c2, ..., then encryption and decryption in CBC mode are performed as follows:
ci = E(mi XOR ci-1)The first plaintext block is XORed with an initialization vector, or IV (which you can think of as c0, since there is no m0). The IV can be publicly known. It is often just 0, unless the same key will be used to encrypt multiple streams, in which case each stream must use a different IV.
mi = D(ci) XOR ci-1
edu_keygen
stores the
private key in a file with permissions set to
0600
---readable only by the owner. This protects your
private key from other users in your system, but won't prevent your
system administrator from reading it.
A better design would be to prompt the user for a passphrase, and then
use (a hash of) the passphrase to encrypt the private key before
storing it to the file.
For extra credit, modify edu_keygen
so that it prompts
for a passphrase twice, aborting if the two passphrases do not
match. (Asking for the passphrase twice protects from the case
that you mistype your passphrase---otherwise, on later access you
will remember the password you wanted to type, but may not easily
be able to figure out what you actually typed.)
You will also need to modify edu_decrypt
so that it also
prompts for a passphrase, and uses it to decrypt the ciphertext
containing the private key. Then, you should be able to run the
edu_keygen
, edu_encrypt
and
edu_encrypt
programs as follows:
% ./edu_keygen my_key.priv my_key.pub Passphrase for my_key.priv: Again: % yes "test" | head -1000 > a_file % ./edu_encrypt my_key.pub a_file an_encrypted_file % ./edu_decrypt my_key.priv an_encrypted_file a_decrypted_file Passphrase for my_key.priv: % diff a_file a_decrypted_file %You may find the following function helpful to read in the passphrase:
#include <unistd.h>
char *getpass(const char *prompt);
Displaysprompt
to the user and reads a password typed at the terminal, turning off echo so that others cannot see the password typed. The password is returned as a pointer to aNULL
-terminated ANSI C string.
Include in the handin directory a short text file called
extra-credit
with a description of the exact technique
you used to implement this feature.
Collaboration policy
You must write all the code you hand in for the programming assignments,
except for code that we give you as part of the assigment. You are
not allowed to look at anyone else's solution. You may discuss the
assignments with other students, but you may not look at or copy
each others' code.
Hand-In Procedure
You must submit two files:
To build a software distribution, run the following commands (from the directory where your source files are located):
% cd .. % tar cf edu.tar lab1/ % gzip edu.tar %(If the name of the directory containing your sources is not
lab1
, then substitute the appropriate name in the
second command above.)
To create a script file, use the script command. When you run script, everything you type gets saved in a file called typescript. Press CTRL-D to finish the script. For example:
% script Script started, output file is typescript % ./edu_keygen my_key.priv my_key.pub % yes "test" | head -1000 > a_file % ./edu_encrypt my_key.pub a_file an_encrypted_file % ./edu_decrypt my_key.priv an_encrypted_file a_decrypted_file % diff a_file a_decrypted_file % % ... % yes "test" | head -1015 > a_file % ./edu_encrypt my_key.pub a_file an_encrypted_file % ./edu_decrypt my_key.priv an_encrypted_file a_decrypted_file % diff a_file a_decrypted_file % % ^D % ^D Script done, output file is typescript %
To turn in your distribution and script file, copy the files
edu.tar.gz
and typescript
to the directory
~class/handin/lab1/logname
(where
logname is your login name on the class machines):
% cp edu.tar.gz typescript ~class/handin/lab1/`logname`/
%
If you have any problems with/question about the submission, please contact the TA or the instructor at
This completes the lab.
dcrypt
LibraryBelow is a description of some of the functions implemented in the dcrypt library that you may find useful in completing the assignment. You will need to include the dcrypt.h header file to access these functions. You may also want to take a look at these sample programs (tst.c, tst_sha1.c) to see some of these functions in action.
void putint (void *dp, u_int32_t val);
void puthyper (void *dp, u_int64_t val);
putint
function puts the 32-bit integer value of
val
into memory in big-endian order at location
dp
. dp
does not need to be aligned. The
bytes stored at dp
will be the same on big- and
little-endian machines. puthyper
is like
putint
but puts a 64-bit value into 8 bytes of memory.
u_int32_t getint (const void *dp);
u_int64_t gethyper (const void *dp);
getint
and gethyper
routines retrieve
values stored by putint
and puthyper
respectively.
char *armor64 (const void *dp, size_t len);
len
bytes from the binary string pointed by
dp
to a longer, base-64, printable ASCII string. You
will need to use this to transform random session keys (which could
contain zero-bytes) into a NULL
-terminated ANSI C string.
ssize_t dearmor64 (void *out, const char *s);
armor64
function, and return the number of
bytes that were placed at out
. The return value is
negative if the NULL
-terminated ANSI C string
s
is not the output of armor64
.
ssize_t armor64len (const char *s);
s
. If some prefix of s
represents a valid
armor64 string, then the length of such prefix is returned. Otherwise,
-1 is returned, indicating that s is not the output of armor64
.
ssize_t dearmor64len (const char *s);
s
. If some prefix of s
represents a valid
armor64 string, then the length of the decoded data that would result
by "dearmoring" s
is returned. Otherwise, -1 is
returned, indicating that s is not the output of armor64
.
The libraries you are using contain a cryptographic pseudo-random
number generator, whose state is kept in a global 16-byte array called
prng_state
. Before using the random number generator,
you must initialize it.
void prng_seed (void *buf, size_t len);
len
bytes from buf
as seed. Providing
a good seed may be a difficult task; some Operating Systems
(including FreeBSD, OpenBSD and most Linux distributions) provide
you with a source of randomness under /dev/random
(or one
of its variants: /dev/srandom
, /dev/urandom
,
etc.). If a random
device is available, you should read (at least) 128 bits from it and
use it as a seed; otherwise, as a very rough
approximation, you could supply some information about your local
machine (e.g., time of the day, PID/GID value) that is
difficult to predict.
void prng_getbytes (void *buf, size_t len);
len
pseudo-random bytes to memory at location
buf
.
u_int32_t prng_getword ();
u_int64_t prng_gethyper ();
For actually encrypting and decrypting file data, you will use the Rijndael [FIPS-197] block cipher (also called AES---Advanced Encryption Standard). Rijndael is a 128-bit block cipher. It supports two operations--encryption, and decryption. Encryption transforms 16 bytes (128 bits) of plaintext data into 16 bytes of ciphertext data using a secret key. Someone who does not know the secret key cannot recover the plaintext from the ciphertext. The decryption algorithm, given knowledge of the secret key, transforms ciphertext into plaintext.
The libraries you are using define a struct
called
aes_ctx
that you should use to hold secret keys for AES.
You should manipulate your AES secret keys with the following functions:
void aes_setkey (aes_ctx *aes, const void *key, u_int
len);
len
bytes fro
the buffer key
. The key must be 16-, 24-, or
32-long.
void aes_encrypt (const aes_ctx *aes, void *buf, const void
*ibuf);
aes_encrypt
transforms 16 bytes of plaintext data at
ibuf
into 16 bytes of ciphertext data which it writes to
buf
. It uses the secret key previously stored within
aes
using the aes_setkey
function.
void aes_decrypt (const aes_ctx *aes, void *buf, const void
*ibuf);
aes_decrypt
decrypts 16 bytes, inverting the
aes_encrypt
function.
void aes_clrkey (aes_ctx *aes);
aes
, thus wiping out the secret
encryption key that was previosly stored there. If you only used symmetric-key cryptography, you would need to exchange a secret key with all the friends with whom you want to have confidential communication. With Public-Key Cryptography, instead, you can give all your friend the same public key, and they will be able to send you encrypted content that only you can recover. Similarly, you only need to know your friend's public key to create a ciphertext that only he/she will be able to decrypt. The libraries you are going to use contain an implementation of two very well-known Public-Key Encryption schemes: Rabin [Wil80] and ElGamal [ElG86] To complete the assignment, you don't need to know anything about how these schemes work, except for the interface that they provide, which is described below:
dckey *dckeygen (const char *type, size_t k, const char
*extra);
struct
that holds a new private
key/public key pair, of the type specified by type
. The
value of type
should be one of the constants defined in
dcrypt.h
, namely
DC_ELGAMAL
or DC_RABIN
.
The cryptographic strength of the key pair generated can be tuned
using the parameter k
: a value of at least 1024 is
recommended to get a reasonable level of security.
The value of extra
should be either NULL
, or
an ASCII string representing information about the parameters that
should be used in generating the private key/public key pair.
For
this assignment, you can always supply a value of NULL
.
Notice that the dckeygen
function internally makes use of
the pseudo-random number generator provided by libdcrypt
:
therefore, you shouldn't call dckeygen
before
initializing the pseudo-random number generator with prng_seed
.
char *dcexport_pub (const dckey *key);
char *dcexport_priv (const dckey *key);
dcexport_pub
and dcexport_priv
functions
return a base-64, printable, NULL
-terminated ANSI C
string representing the public key or private key stored within
key
, respectively.
dckey *dcimport_pub (const char *asc);
dckey *dcimport_priv (const char *asc);
dcimport_pub
and dcimport_priv
functions
retrieve the dckey
that was previously exported into
the ASCII string asc
using dcexport_pub
or
dcexport_priv
, respectively.
If asc
is not of the form expected,
dcimport_pub
and dcimport_priv
return
NULL
.
char *dcencrypt (const dckey *key, const char *msg);
dcencrypt
uses the public key contained in
key
to transform the plaintext data contained in the
NULL
-terminated ANSI C string msg
into a
NULL
-terminated ANSI C ciphertext string, which is
returned as output.
The encryption algorithm used is CCA-secure.
Since dcencrypt
expects the plaintext to be a
NULL
-terminated ANSI C string, you should use
the armor64
function if you need to encrypt an arbitrary
bit string (such as a secret key for AES).
char *dcdecrypt (const dckey *key, const char *cmsg);
aes_decrypt
decrypts the ciphertext contained in the
NULL
-terminated ANSI C string cmsg
,
inverting the dcencrypt
function.
void dcfree (dckey *key);
key
.
Once you are done with a private key, you
should always wipe out the memory location where it was stored!
int dcispriv (const dckey *key);
key
is a private key; otherwise it returns 0.
The SHA-1 [FIPS-180-1] hash function hashes an arbitrary-length input (up to 2^64 bytes) to a 20-byte output. SHA-1 is known as a cryptographic hash function. While nothing has been formally proven about the function, it is generally assumed that SHA-1 is one-way and collision-resistant. These properties are defined as follows:
For someone who steals the file of password hashes, there is no known way of recovering passwords more efficient than guessing passwords and verifying the guesses. (Of course, the fact that users often choose easily-guessed passwords is a problem.)
Collision-resistant functions have many uses, stemming from the fact that the short output value effectively uniquely specifies an arbitrary-length input. One cannot recover the input from the output, but given the input, one can verify that it does, indeed, match the output. One might, for instance, implement a web cache in which contents is indexed by a SHA-1 hash of the URL. Having fixed-length names for stored content would simplify the implementation.
void sha1_hash (void *digest, const void *buf, size_t
len);
len
bytes of data at buf
, and places
the resulting 20 bytes at digest
.
Sometimes the input that you want to hash is so long that it is
inconvenient to store it entirely in memory before being able to
hash it. This is the case for example when hashing the entire content
of a file into a short digest.
For this reason, the libraries you are using allow you to process a
long input "one chunk at a time." To do that, you should use a
struct
called sha1_ctx
, which will store the
"partial digest" as you keep providing new input to be hashed. You
should manipulate sha1_ctx
struct
s with the
following functions:
void sha1_init (sha1_ctx *sc);
sha1_ctx struct
that will contain the
partial hash.
void sha1_update (sha1_ctx *sc, const void *data, size_t
len);
len
bytes at data
to the input being
hashed, but does not produce a result. Thus, one can hash a large
amount of data without having it all in memory, by calling
sha1_update
on one chunk at a time.
void sha1_final (sha1_ctx *sc, void *digest);
digest
.
Message Authentication Codes (MACs) are a symmetric-key primitive allowing you to check the integrity of the information to which the MAC is applied. Recall that encryption does not guarantee integrity! The fact that you were able to decrypt a ciphertext is not enough to be sure that nobody tampered with its content. For integrity, you should always append a MAC to the content.
You use MACs as follows. Let's say that you want to store a file on your file-server, but you are afraid that its content will be changed behind your back. Then, you use a secret key to "mac" the file, and store the resulting MAC along with the file. Now, when you check back with your file-server and retrieve your file, you will also retrieve the MAC that you appended. Then, you will use the secret key to compute the MAC again, and if the MAC you just computed is the same as the value that you retrieved from the file-server, then you are sure that nobody touched your file. This is because, if somebody had changed the file, then they should have computed the corresponding MAC in order to fool you. However, secure MACs are concocted such that, without knowing the secret key, it is computationally intractable to compute the right MAC, even after having seen a lot of valid (message, MAC) pairs.
The Keyed-Hash Message Authentication Code (HMAC) [FIPS-198a] is a secure Message Authentication Code based on the use of any cryptographic hash function, like SHA-1. The libraries you are using contain an implementation of HMAC, instantiated with the SHA-1 cryptographic hash function.
void hmac_sha1 (const char *key, size_t keylen, void *out, const void
*data, size_t dlen);
len
bytes of data at
buf
using the key key
, and places
the resulting 20 bytes at out
.
Similarly to what discussed for the case of SHA-1, the libraries you are using allow you to process a long input "one chunk at a time."
void hmac_sha1_init (const char *key, size_t keylen,
sha1_ctx *sc);
sha1_ctx struct
that will contain the
partial HMAC under the key key
.
void hmac_sha1_update (sha1_ctx *sc, const void *data, size_t
len);
len
bytes from data
to the input being
hmac'ed, but does not produce a result. Thus, one can hmac a large
amount of data without having it all in memory, by calling
hmac_sha1_update
on one chunk at a time.
void hmac_sha1_final (const char *key, size_t keylen,
sha1_ctx *sc, void *out);
out
. key
used in
hmac_sha1_final
is different from the one initially used in
hmac_sha1_init
.
[ElG85] |
T. ElGamal, A public key cryptosystem and a signature
scheme based on discrete logarithms. IEEE Transactions on Information Theory, Vol. IT-31, No. 4, pp. 469--472, July 1985. |
|
---|---|---|
[FIPS-180-1] |
FIPS-180-1, Secure Hash Standard. U.S. Department of Commerce/N.I.S.T., 1994 |
|
[FIPS-197] |
FIPS-197, Announcing the Advanced Encryption Standard. U.S. Department of Commerce/N.I.S.T., 2001 |
|
[FIPS-198a] |
FIPS-198a, The Keyed-Hash Message Authentication Code
(HMAC). U.S. Department of Commerce/N.I.S.T., 2002 |
|
[Wil80] |
H. C. Williams, A Modification of the RSA Public-Key
Encryption Procedure. IEEE Transactions on Information Theory, Vol. IT-26, No. 6, November 1980. |