Hints for assignment 3

Getting the key information

Suppose you ask the user for a key. The user types
  T`was brillig and the slithy toves
You want to use this string as the seed to a psuedo-random number generator. The one provided to you in the library wants a decimal string as its seed. So you
  1. Convert this string of characters to a number, using your function strToNum.
  2. Convert the number to a decimal string, using standard function $.
Then you feed the resulting string into function SetRandomSeed.

Now you want to get two prime numbers p and q. It is important to think ahead here. When you get around to doing the encryption or decryption, you will need to chop a file into blocks, where each block is encrypted as a single unit. Decide right now how large those blocks will be. Suppose that you decide to encipher blocks of about 10 characters (much too small for high security, but fine for testing). Since you treat a string as a number in base 257, the largest number that can be written in 10 characters is just less than 257^10. You want n (=pq) to be a little larger than that, since the RSA system requires that n be larger than any number that you try to encrypt.

You might choose p and q randomly in the range [2, ..., 257^6]. That way, each should be roughly in the middle of that range, and their product should end up being larger than 257^10.

It is entirely possible (but unlikely) that you find out, after choosing p and q at random and computing their product n = pq, that n < 257^10. In that case, you should either choose new numbers p and q, or (simpler) just scale back how many characters in a block. Whatever n is, you can safely use log(base 257) n characters in a block. To avoid type checking problems, compute this as nabs(floor(log 257 ~n)).

Please do not hard-code a number like 10 into your program. Instead, do something like

  Let nominalCharsPerBlock = 10.
and use nominalCharsPerBlock as the value to decide the range from which to choose p and q. That way, you can adjust this quantity as desired.

Finding a prime number

If you want to find a prime number in the range [2, ..., z], just select a random number r in that range, and then select the first prime number in the list [r,r+1,r+2,...] (an infinite list). Use library function prime? to do the primality test.

Getting the keys

You need to get n, phi, e and d. n = p*q and phi = (p--1)*(q--1). Get e by selecting a member of list [3,4,5,...] such that the selected value is relatively prime to phi. (What function do you want to pass to select?) Get d by the formula given in the assignment. Check that, indeed, e*d mod phi = 1, in case you have made a mistake.

Converting strings to numbers

You have undoubtedly learned how to convert from one base to another. That is all that is going on here, really. The language you are using has built-in operations that perform arithmetic. You are converting from base 257 to whatever the built-in base is.

Let's explore a similar problem, converting from base 16 to base 10, where we have base 10 arithmetic built-in. Number 2a34 (base 16) can be thought as the list of its digits, [2, 10, 3, 4]. (Note that we can decide to write the numbers from left-to-right if we like, rather than from right-to-left.) This list of digits represents the decimal number 2*16^3 + 10*16^2 + 3*16^1 + 4. Factoring out common factors of 16 yields

((2*16 + 10)*16 + 3)*16 + 4
It is convenient to add a starting value for a fold. Starting at 0 yields expression
(((0*16 + 2)*16 + 10)*16 + 3)*16 + 4
Suppose that function f is defined so that f(x,d) = x*16 + d. Then the number represented by hexadecimal number [2,10,3,4] is just
f(f(f(f(0,2),10),3),4)
This is a little awkward. A special binary operator can be defined, just to make this more clear. Say that # is the binary operator. Define x#d = f(x,d) = x*16 + d. Also, suppose that # is left-associative, so that several of them in a row are done from left to right. Then the number represented by hexadecimal number [2,10,3,4] is just
0 # 2 # 10 # 3 # 4
Obviously, this base conversion is just a left fold. The result is 10804.

Of course, what you really start with is a string of characters, not a string of integers. But you can convert each character to its ascii code (using function rank), and you can convert them all by doing a map. You don't want 0's in the list, so you add one to each one of them. That is, what you really want is not the ascii code of each character, but a number that is one larger than the ascii code. If you see character 'a', you want to replace it by 99. (The ascii code for 'a' is 98). How can you do that?

Converting from numbers to strings

You need to convert back from a number to a string. That amounts to nothing but reversing the base conversion process, and reversing the conversion to numeric codes. How do you convert back? Suppose you wanted to do a computation that converts the decimal number 10804 back to the list of hexadecimal digits [2,10,3,4]. The computation has to "unfold" the number 10804 into a list.

The simplest way to understand this function is to work in mathematics rather than thinking too deeply about the process involved. Suppose that number n is represented by list x (in base 16). For example, n might be 10804, and x would be [2,10,3,4]. Divide n by 16, yielding quotient q and remainder r. That is, n = 16*q + r. Then x must be y++[r], where y is the result of converting q to a base 16 list.

For example dividing 10804 by 16 yields a quotient of 675 and a remainder of 4. Converting 675 to hexadecimal yields [2,10,3]. So converting 10804 to hexadecimal yields [2,10,3,4].

Just write a recursive definition of a function that converts a number to a base 257 number (as a list of digits). Now you need to take that list of digits and convert it back to a list of characters.

Breaking up a file

In order to encipher a file, you need to break it up into blocks. If your block size is 10, then you want to take a string such as "did gyre and gimble in the wabe" and convert it to the list of strings ["did gyre a", "nd gimble ", "in the wab", "e"]. Thre is a function called pullUpToN such that (pullUpToN k) pulls a list apart into its first k characters and the suffix after those k characters. For example, pullUpToN 10 "did gyre and gimble in the wabe" = ("did gyre a", "nd gimble in the wabe"). What you want to do is repeat this function on the suffix, and keep doing it until the suffix is the empty lists. Function pulls does this repeated action, where you must tell it how to pull apart the list. (This operation is a special kind of left-unfold.) So
   pulls (pullUpToN 10) "did gyre and gimble in the wabe"
   =
   ["did gyre a", "nd gimble ", "in the wab", "e"]

Putting a file back together

How would you reverse the process of breaking a file apart into pieces. For example, how would you convert list ["did gyre a", "nd gimble ", "in the wab", "e"] into string "did gyre and gimble in the wabe"? You should be able to get this one.

Enciphering a file

You need to write a function that enciphers a block. Mapping it onto the list of blocks enciphers the entire file. To encipher a block, convert the block to a number. Then take it to the power e mod n, where e and n are the numbers that you computed for the key.

Deciphering a file

Write the enciphered file as a list of numbers, and read it back as a list of numbers. Then you just need a function that deciphers one of those numbers. You can always map that function onto the list of numbers.

To decipher a number x, you take x to the power d, mod n. From a mathematical standpoint, it would be correct to write

      Let blockDecipher(?x) = numToStr(s^d _mod_ n).
Unfortunately, even for moderate size n, this will take a VERY long time. In reality, it will exhaust the available memory before it can finish. It is EXTREMELY inefficient, both in memory and time. You must use the efficient power algorithm that is provided to you in the modular.ast package. It works by the following equations. Suppose that pow(x,k,n) is suppose to compute x^k mod n.
     pow(x, 2k, n)  = pow(x, k, n)^2 mod n
     pow(x, k+1, n) = x*pow(x, k, n) mod n
     pow(x, 0, n)   = 1                       (x not 0)
where the squaring done in the first equation is done as a single multiplication of a number by itself.

Packages

The simplest thing to do is to put the encipher program in one package and the decipher function in another. But there are several functions that both need to share. For example, both of them need to compute a key, and they want to be sure to get exactly the same key values n, e and d from a given key string. Making separate copies of functions that do these shared operations is a very bad idea. Not only does it cause duplicated code (which is not all that bad) but it opens up the possibility that you will make a modification to one of them, forgetting to modify the other (and that is bad). So put the shared functions in a package that both the encipher and decipher package import.