Hints for assignment 3

Importing library packages

You will need to import some libraries for this assignment. To get the random number generator, for example, you must say

       Import "misc/random".

and to get the select function (which you might find useful), you need

       Import "collect/search".

The documentation tells you which packages to import for each function. See the library by category for brief descriptions of library functions.

Getting the prime numbers p and q

You need to select two prime numbers p and q in the range from 257^S to 257^S+1. To select a random prime number, select a random start value k in the range from 257^S to 257^S+1, and build the infinite list [k, k+1, k+2, ...]. Then select the first member of this list that is prime. (Don't worry about the rare case where the result is larger than 257^S+1. It is rare, and won't matter anyway.) Useful functions are select (package collect/search.ast) and prime? (package math/prime.ast). Build the list [k,k+1,...] using [k,...]

Be sure to select p and q independently, not from the same infinite list. Write a function that selects a random prime number in a given range, and use it twice.

See the next section for how to get a random number.

Warning. The ^ operator has higher precedence than the + operator. Pay attention to what you write. 257^S+1 is not the same as 257^S+1.

Getting "Random" Numbers

Package misc/random.ast provides a pseudo-random number generator. It starts with a seed value provided to it, and produces a fairly long sequence of random numbers from that seed. If x is a string of decimal digits, then

      SetRandomSeed x.

sets the seed to x. String x should have at least 40 digits, although it can be shorter than that. See random number generator.

You can use the string provided by the user as the seed. The seed is required to be a string of decimal digits, not an arbitrary string. But you can turn any string s into a string of decimal digits using your stringToNumber function. Compute $(stringToNumber s), since stringToNumber converts a string to a number, and built-in function $ converts a number into its decimal representation.

After setting the seed, you can use the random number generator to select the prime numbers p and q. Only set the seed once, not every time you want another pseudo-random number. Use function randomRange to select a random number. Expression randomRange(x,y) produces a "randomly" generated number between x and y inclusive.

Note: If you use a seed that comes from the user, then the numbers generated will depend on the seed that the user selected. This is where the dependence on the user's key comes in. A different key yields a different seed which yields different numbers p and q.

Note: The random number generator is not functional. Each time you call randomRange(x,y), a new pseudo-random number is created. This is one of the small non-functional aspects of your program.

Getting phi and e

After selecting prime numbers p and q, you will need to compute n, phi, e and d. When you compute phi, use operator -- instead of -, to make the compiler know that the answer is not a negative number. That is, phi = (p--1)*(q--1). (x -- y is |x-y|.) Use the select function to select e from the list [3,4,5,...]. You are guaranteed to find a suitable value for e very quickly. What is the predicate that e must satisfy? The gcd function is standard.

Getting d

The number d is chosen so that d*e mod phi = 1. Do not try to find d by searching for it. There will be far too many values to try. There is a well-known algorithm, called the Extended Euclid Algorithm, that can find d. The details of that algorithm are beyond the scope of this class. However, the algorithm has already been implemented for you.

You can think of arithmetic mod m, for any m, as defining alternative operations +, - and * to the usual ones. You define x + y in this alternative arithmetic as (x + y) mod m. For example, mod 14, you say that 5 + 12 = 3. This is the kind of arithmetic (mod 12) that you perform when you use a clock (with the exception that the numbers are 1-12 rather than 0-11). For example, if you add an hour to 12 o'clock, you get 1 o'clock. So, in clock arithmetic, 12+1 = 1 and 10 + 4 = 2.

If we agree to do arithmetic mod phi, then the characteristic desired for d is that d*e = 1. But that is the usual requirement of the reciprocal of e. (The reciprocal of a number x is a number y such that x*y = 1.) So, in a sense, d is the reciprocal of e. But it is not the usual reciprocal, since we are using an unusual form of arithmetic. You cannot use the 1/x key on your calculator to find it. Still, this seems to involve some kind of analog of division in the alternative arithmetic.

Import package math/modular.ast. It defines a type Modular of numbers on which the alternative arithmetic is done. Expression (x _modulo_ m) produces number x, where arithmetic on x is done mod m. The operations of +, -, * and / are all defined for type Modular. There is a unary version of / that takes the reciprocal. So /x is the reciprocal of x. To get back from a Modular number to an ordinary number, you can use operator _modulo_ in a pattern. So to compute d from e you can write

      Match ?d _modulo_ phi = /(e _modulo_ phi).

That is, d is the reciprocal of e when arithmetic is done mod phi.

Enciphering and deciphering

To compute the encipher and decipher functions, you will need to perform exponentiation. Exponentiation is available as operator ^. But if you write expression (x^d) _mod_ n, you will run into a very serious problem that you cannot ignore. Imagine that x and d are large, say about 10⁵⁰. How large is x^d? It is far too large ever to be stored in any computer that anyone will ever build. To get around this, you must reduce mod n at every step. That is, every time you do a multiplication, you must take the result mod n. That keeps the numbers small (and produces the correct result). Try computing 3⁸ mod 5 this way. Each time you do a multiplication, take the result mod 5. Check your answer with a calculator.

There is another serious problem as well. When you computed 3⁸, you probably did 7 multiplications. To compute 3²⁰, you would do 19 multiplications. But if you try to compute x^d using d-1 multiplications, where d is about 10⁵⁰, then you find that the computation does not terminate within your lifetime. (10⁵⁰ is very roughly the number of protons in the Earth. You cannot expect to do that many multiplications.) A better algorithm is needed. What you do is repeated squarings to get large powers rapidly.

If you use the exponentiation provided in package math/modular.ast, then this will be taken care of. To compute m = (x^d) mod n, write

  Match ?m _modulo_ n = (x _modulo_ n)^d.

Now you are using the modular arithmetic exponentiation function, which is efficient, and then extracting the result m as a natural number. See here for the algorithm that is used.

Computing listToNumber

You can write function listToNumber directly, using recursion. You can also write it without recursion.

listToNumber is a kind of fold. To see this, consider the case where the base is 10. Define a binary operator $ so that x $ y = 10*x + y. What x $ y does is shift x to the left by one digit, and then add y. So, for example, 234 $ 5 = 2345.

Notice that 2345 = 0 $ 2 $ 3 $ 4 $ 5, where the $ operations are done from left to right. So listToNumber is a left-to-right fold, and you can use function foldLtoR to compute listToNumber.

The definition of operator $ depends on the base (10 in the example). If you define function dol so that dol b (x,y) = b*x + y, then $ is just (dol 10). You should be able to see how to do the fold.

Computing numberToList

You can use recursion to define numberToList, and you should feel free to do that. Try some examples in base 10 to see what needs to be done, and generalize to arbitrary bases. (What works in base 10 will work in other bases too.)

An alternative is to recognize that, by the principle of duality, since listToNumber is a left fold, its inverse numberToList is a right unfold. For decimal aritmetic, you have used function f(x,y) = 10*x + y. You unfold by the inverse of this function, where f^-1(10*x + y) = (x,y). You will not be able to write the function definition in exactly this form, since the pattern matcher does not know how to solve equations of the form 10*(?x) + ?y = n. But you should be able to see how to write this function. Use _mod_ and _div_.

You can use function unfoldRtoL to do the job. You will need to read the documentation of unfoldRtoL to use it.

Getting a line of text

Expression readLine() produces a line of text that is read from the standard input. This is a nonfunctional aspect, but is an easy way to get the input.

Expression skipWS(x) is the result of stripping all leading white space from the front of string x. For example, skipWS(" abc ") = "abc ". How can you strip white space from both ends? (Hint: use reverse.)

Functions skipWS and readLine are both defined in package collect/string.ast, and you will need to import that.

Breaking up the input string

You need to break the input string into a list of strings, where each string in the list is short enough to encipher. Lists are typically read using getters.

A getter is a function that takes a list (such as a string) and extracts something from the front of the list. It returns a pair consisting of the value extracted and the rest of the list. For example, if function get2 is designed to remove two characters from the front of a string, then get2("abcdefgh") = ("ab", "cdefgh").

Higher order function repeatGet takes a getter and uses it repeatedly until the list is exhausted. For example, (repeatGet get2 "abcdefgh") = ["ab", "cd", "ef", "gh"]. What repeatGet has done is run get2 to extract the first two characters, then run it again to get the next two characters, etc.

You will want a library function called getUpToN. It is a curried function. Expression (getUpToN k) produces a getter that extracts k characters from the front of the list. (If there are not k characters remaining, it gets them all.) So, for example, getUpToN 3 "abcdefg" = ("abc", "defg").

Writing a list to a file

You can use WriteFile to write an entire file (open, write, close all in one). To make the file called f hold string s, use

       WriteFile f, s.

WriteFile is an imperative operation. Function $ will convert a list to a string. For example, $([23,4,567]) = "[23,4,567]". Use that to print a list.

Reading a list from a file

If you have written a list of numbers into a file called f, you can get the list back by doing a pattern match against the contents of the file. What you wrote to the file was $(L) for some list L, so use

     Match $(?x:[Natural]) = infile(f).

to bind x to the list of natural numbers that was written into file f. You must tell the compiler what type of thing to look for, since it has no other way of determining what you want.

At some point you will want to concatenate all of the strings back together, to reverse the process of cutting up the file into short pieces. How can you do that?

Getting the key triple

Use pattern matching to get the key triple from a key file. If you write key (2S, n, e) using

   WriteFile keyfile, $(2*S,n,e).

then you can get them back from file keyfile using

   Match $(?twoS,?n,?e) = infile(keyfile).

Sections below are for information only
and are not required for implementing this program

Factoring and security

See factoring and security for a discussion of security of the RSA system.

Other security issues

See here for a discussion of problems with how the RSA system is used in this assignment, and how they can be fixed.

Checking primality

The method of checking whether a number is prime is important, since the RSA system needs very large prime numbers. See checking primality.

Practical encryption

See here for a discussion of practical use of the RSA system.