Computer Science 3675
Fall 1999
Programming Assignment 3

Due: October 20, 5:00 pm.

This is an exercise to see how a nontrivial program can be written using pure functions. The goal is to write short and simple programs. In a few places you will need to use imperative operations that call for doing something, but that should be kept to a minimum. Instead, just describe what you want.

This exercise is probably the longest exercise that you will be assigned in this class. However, there is a lot of help in this handout. If you do the program step by step, you will find that the individual steps are very short and simple, and you get a working product quite quickly. Do not try to write the entire program and then test it. Test each function as you go. Failure to do this will cause you grief.

The RSA system

For this exercise, you will encipher and decipher files using the RSA system. It is an industrial strength cryptographic system that works as follows. Links to hints about how to accomplish the operations are provided. Read through the entire description before getting into the details of how to do things.

Selecting the keys.

First, you must select two sets of keys, a public set and a private set. Do this as follows.

  1. Select at random two large prime numbers p and q. (For high security, p and q should have about 100 decimal digits each. You will want to test your program with much smaller numbers than that, but it should be capable of using very large numbers.) Prime numbers p and q must be selected independently of one another, or you will lose security. Do not, for example, choose q to be the next prime number after p. Doing so will make your code very easy to break. You do not want anybody to be able to guess p and q. [Hint on getting p and q]

  2. Let n = pq, and let phi = (p-1)(q-1). Select a small number e > 2 such that gcd(e,phi) = 1. [Hint on getting phi and e]

  3. Find an integer d such that (ed) mod phi = 1. (It is guaranteed that such a number d exists.) [Hint on getting d]

The public key is the pair (n,e), and the private key is the pair (n,d). You can tell everybody about your public key if desired, but you keep the number d hidden.

Enciphering a message.

A "message" to be enciphered is an integer k where 0 < k < n. The encipher function is
encipher(k) = (k^e) mod n
where ^ indicates exponentiation. That is, take k to the power e, and take the remainder when you divide that result by n. [Hint on enciphering]

Notice that anybody who knows the public key (n,e) can encipher a number k.

Deciphering a message.

The decipher function has the property decipher(encipher(k)) = k. That is, if you decipher an enciphered message, you get back the original message. It is defined by

decipher(k) = (k^d) mod n
Notice that deciphering is the same kind of operation as enciphering. It just uses a different exponent. The function that enciphers can also decipher.

Only somebody who knows the private key (n,d) can decipher messages. It is believed to be difficult to obtain d, knowing only n and e. The only known way to do that is to find the prime factors of n, and there is no efficient method known to do that. [Notes on factoring and security]

Example

For this example, we choose small prime numbers p = 7 and q = 13. Then n = 91 (since n is 7 times 13) and phi = 72 (since phi is 6 times 12). We can choose e = 5, since 5 > 2 and gcd(5, 72) = 1. Then d = 29, since 5*29 mod 72 = 1. The encipher and decipher functions are
encipher(k) = k^5 mod 91

decipher(k) = k^29 mod 91

For example, encipher(18) = 18^5 mod 91 = 1889568 mod 91 = 44. Then decipher(44) = 103398262268135618662965386326260402241163444053559142001800343869789787916826452309190192417264131218078348833560221412981442053155592422955707936598553748950499511586893512801517568 mod 91 = 18. (This example should be enough to convince you that some care must be taken in how the encipher and decipher functions are computed. Even for these rediculously small values of p and q, quite large intermediate results show up. Fortunately, those large intermediate results can be avoided. See the hint on how to encipher.)

The Assignment

For this assignment, you will write two Astarte programs, one to encipher and one to decipher. In addition, you will write a package of functions for both to share. So there will be a total of three packages. You should find that the shared package is roughly 30-50 noncomment lines (including import lines and expect lines), and the encipher and decipher packages are roughly 20-35 noncomment lines each. Noncomment lines do not count blank lines.

Please include clear and useful comments in your packages. Write them for other people to read. Some rules of thumb about commenting are

  1. Direct your comments to someone who knows a little less than you do, but assume the reader knows the basics of the language and what fundamental library functions do.

  2. Use examples in your comments. Showing how your program processes an example can make it much clearer what is going on, if the examples are chosen well.

  3. Write in clear, complete sentences. Spell words correctly, and use correct punctuation.

  4. Write the comments into the program during development, not when you are finished with development. That way, they help you.

The programs should be fairly short and simple. You will need to import some libraries for this. To get the random number generator, for example, you must say

       Import "misc/random".
and to get the select function (which you will find useful), you need
       Import "collect/search".
The function documentation tells you which packages to import.

1. Strings and Numbers

The RSA system enciphers numbers. What you want to encipher, however, is a string. Also, you will find it convenient to use a string as the key, rather than numbers. So you need a way to convert from strings to integers and back.

A string is a list of characters. Each character has a code, which is an integer. Function rank will give you the code of a character. For example, rank('a') = 97. If you map rank onto a string, you get a list of numbers. For example, map rank "abc" = [97, 98, 99].

Write a function listToNum so that listToNum b x takes a base b and a list of digits x, and produces the number that x represents as a base b number. For example, listToNum 10 [9,4,2] = 942. [Hint on listToNum]

Zero digits will cause a problem later. This is because leading zeros are not visible. For example, listToNum 10 [0,1] is the same as listToNum 10 [1]. When converting a string to a number, we do not want to lose any leading zeros in the string. To avoid this, add 1 to the code of each character in a string. Then each character stands for a number between 1 and 256 instead of a code between 0 and 255.

Now write a function strToNum so that strToNum(s) is a number that string s represents, where each character of s is thought of as standing for one larger than its code, and the string is a number in base 257.

You will also need to convert back from a string to a number. Write a function numToStr so that numToStr(strToNum(s)) = s. That is, numToStr does the inverse transformation of strToNum. The main part of this is a function numToList so that (numToList b n) produces the list of digits represented by number n, without any leading zeros. For example, numToList 10 942 = [9,4,2]. [Hint for numToList]

Put functions strToNum and numToStr in a utility package that the encipher and decipher programs can share. Your package will look something like this. Note the expectation, which tells other packages that functions strToNum and numToStr are defined here, but does not say how they work.

Package RSAUtilities

Export

Expect
  strToNum: String -> Natural;
  numToStr: Natural -> String;
%Expect

Implementation

 Your definitions go here.

%Package
After you write these functions, be sure to check them before going on.

2. Selecting Keys

The user will give a string as his or her key. You will want a function that converts that string into triple (n,e,d), where n, e and d are the values selected according to the rules described for the RSA cryptosystem. Put that function into the utility package, and be sure to expect it in the export part. Its type will be String -> (Natural, Natural, Natural). If you like, you can make the computation of d lazy, since it is not needed for the encipher function.

After you implement the key selection code, try writing the key information to standard output before proceeding. Use short strings as keys. Does it look reasonable? Is e*d mod phi = 1?

3. Enciphering

Write a program that enciphers. You should use command
  astr encipher myfile.txt myfile.cph
to place, in myfile.cph, an enciphered version of myfile.txt. The program should ask the user for a key string. To get the key string, just get an entire line from the user, and strip any white space at either end, since it is invisible, but will affect the key. [Hint on getting a line]

The command line arguments can be obtained from commandLine. In the example above, commandLine is the list ["encipher", "myfile.txt", "myfile.cph"].

The content of file myfile.txt can be obtained as the value of expression infile("myfile.txt"). The content is a string.

The string in the file to encipher will, in general, be too long to encipher. You will need to break it down into pieces of a reasonable length, so that you can encipher each piece separately. If the pieces are too short, then you have no security. (An extreme example is choosing the pieces to have only one character in them. In that case, your code is nothing more than the kind of cipher that newspaper readers are invited to break, next to the crossword puzzle, for their amusement. That is certainly not an industrial strength cryptosystem!) If the pieces are too long, the RSA system will lose information. What you require is that, for each piece s, strToNum(s) < n, where n is the value from the RSA key pair. [Hint on breaking up the input]

Get the public key pair, using the function produced above. At this point, you have made the file content into a list of strings, the individual pieces of length k. Change this into a list of numbers, by using strToNum on each string in the list. Then change the list of numbers by replacing each number x by encipher(x). This list is the enciphered version of the file. Write it to the file that should hold the enciphered text. [Hint on writing the list]

4. Deciphering

To decipher, reverse the process. Command
   astr decipher myfile.cph myfile.plain
should write, into file myfile.plain, the deciphered version of myfile.cph. It should prompt for the key string. The deciphered version should be identical to the original. Unix command diff compares files, and tells you how they differ. If you do the encipher and decipher operations shown, and then do command
   diff myfile.txt myfile.plain
then you should find no differences. (The diff command will not print anything at all to show that there are no differences.) See [Hint on reading a list] to see how to read the enciphered file in.

Turning in your program.

Email your programs (the raw programs, NOT the compiler listings), as attachments, to karl@cs.ecu.edu. There should be three attachments, one for the utility package, one for the encipher program and one for the decipher program. To make testing easy, please use the following names for the files.
  1. Utility file: RSAUtils.ast
  2. Encipher file: encipher.ast
  3. Decipher file: decipher.ast
The body of the message should be empty. The subject of the message should be
3675 assn3 your name
If you send a question by email, be sure to use a different subject, so that I do not mistake your question for a submission.

Notes

End Effects

After you break up a file into pieces for encipherment, the last segment of a file can be shorter than the rest. That is a security loophole, since it might be possible to figure out what that last segment is. Can you see how to fix this problem? (Don't bother to do it for your program.)

Export Controls

The United States Government has placed export controls on strong encryption technology, and the RSA system falls under that heading. At the time of this writing, it is illegal for you to export your program to any other country (assuming that it works)!