
Lecture #21:  Beginning authentication


I reminded people that a recent Rutgers graduate working at the NSA would be discussing her work on Friday, and invited them to come. (None of the students actually attended.)

Then I outlined what browsers do when asked to exchange information securely: the two sides first exchange a key via an RSA or Diffie-Hellman protocol, and then each side runs the same bitstream generator, XORing its output with the information that passes back and forth. Key exchange is sometimes quite slow; the bitstream generators are much faster. The use of the latter is as much a matter of psychology and convenience as anything else -- if people have to wait too long, they are not likely to use the facility.
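As a sketch of the XOR step only -- with a toy, deliberately insecure stand-in for the bitstream generator (the names and the seed are illustrative, not any real browser's scheme):

    import random

    def keystream(seed, n):
        """Toy keystream: n pseudorandom bytes derived from a shared seed.
        Stands in for a real generator such as RC4 -- NOT secure."""
        rng = random.Random(seed)
        return bytes(rng.randrange(256) for _ in range(n))

    def xor_bytes(data, stream):
        """XOR message bytes against keystream bytes."""
        return bytes(d ^ s for d, s in zip(data, stream))

    # Both sides hold the same seed (the exchanged key), so encrypting and
    # decrypting are the same operation: XOR twice and the keystream cancels.
    key = 2023
    msg = b"attack at dawn"
    ciphertext = xor_bytes(msg, keystream(key, len(msg)))
    plaintext = xor_bytes(ciphertext, keystream(key, len(ciphertext)))
    assert plaintext == msg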

One bitstream generator in current use is RC4. A consequence is worry about how "random" the pseudorandom bitstream given by RC4 is, and its behavior has been critiqued a great deal. Marked deviations from statistically random behavior would be horrible in practice: easily exploited. But bitstream generators are quite fast. Since I had not discussed or alluded to shift registers, a specific discussion of the inner workings of bitstream generators would have been difficult. I did remark, however, that in practice "depths" -- two messages enciphered with the same keystream -- can occur with RC4, and leave communication rather insecure.
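RC4 itself is short enough to write down. A sketch of the standard algorithm -- the key-scheduling pass followed by the output loop -- with a small demonstration of why a depth is dangerous (the function name and the sample messages are my own):

    def rc4_keystream(key, n):
        """Generate n keystream bytes from a byte-string key (textbook RC4)."""
        # Key-scheduling algorithm: build a key-dependent permutation S of 0..255.
        S = list(range(256))
        j = 0
        for i in range(256):
            j = (j + S[i] + key[i % len(key)]) % 256
            S[i], S[j] = S[j], S[i]
        # Output loop: keep stirring S and emit one byte per step.
        i = j = 0
        out = bytearray()
        for _ in range(n):
            i = (i + 1) % 256
            j = (j + S[i]) % 256
            S[i], S[j] = S[j], S[i]
            out.append(S[(S[i] + S[j]) % 256])
        return bytes(out)

    # A "depth": two messages XORed with the SAME keystream.  XORing the two
    # ciphertexts together makes the keystream cancel, leaving the XOR of the
    # two plaintexts -- a rich target for frequency analysis.
    ks = rc4_keystream(b"shared key", 16)
    c1 = bytes(a ^ b for a, b in zip(b"first message!!", ks))
    c2 = bytes(a ^ b for a, b in zip(b"second message!", ks))
    assert bytes(a ^ b for a, b in zip(c1, c2)) == \
           bytes(a ^ b for a, b in zip(b"first message!!", b"second message!"))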

I also showed students profiles for English and German (source: Bauer, Decrypted Secrets): letter frequencies, digram (two-letter sequence) frequencies, and even trigram frequencies. Again, any roughness in these distributions could be used to help decrypt bitstreams, as in the homework. Why were these tables laboriously compiled? Can one use such statistics to trace the evolution of a language?
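Such tables are easy to recompute today from any machine-readable text. A minimal sketch of counting letter and digram frequencies (the sample sentence is just a placeholder corpus; note that digrams here run across word boundaries once spaces are dropped):

    from collections import Counter

    def frequencies(text):
        """Relative letter frequencies and raw digram counts for the
        alphabetic characters of text (case-folded, punctuation dropped)."""
        letters = [c for c in text.upper() if c.isalpha()]
        singles = Counter(letters)
        digrams = Counter(a + b for a, b in zip(letters, letters[1:]))
        total = len(letters)
        return {c: count / total for c, count in singles.items()}, digrams

    singles, digrams = frequencies("the quick brown fox jumps over the lazy dog")
    print(sorted(singles.items(), key=lambda kv: -kv[1])[:5])  # most common letters
    print(digrams.most_common(5))                              # most common digrams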

I mentioned that we would step away from crypto for a bit -- "Trust me" -- because I wanted to discuss how to authenticate a computation.

I wrote a fairly large multiplication (12 digits by 5 digits) and, as an answer, first wrote a very small number. I asked if that was correct and was told it was too small. I wrote a very long (many, many digits) answer, and was told it was too large. Then I wrote what I told them was almost the correct answer, and asked how we could check whether it was correct. This answer had the wrong parity (a last digit that was odd when it should have been even, or something similar). Then I got the correct last and first digits and gave another answer whose correctness I questioned.

It turned out almost no student in the class had heard of "casting out 9's". I described the method, which uses the fact that 10 mod 9 is (9+1) mod 9, which is 1 mod 9, so 10^n mod 9 is just 1 for every n: a number and the sum of its decimal digits therefore leave the same remainder when divided by 9. I applied this concretely to a 4-digit number to get another, smaller number with the same remainder upon division by 9, and then used casting out 9's to show that my suggested answer was invalid. The computation was shown to be false without a substantial amount of work. I talked a bit about the history of casting out 9's (it is many centuries old) and about the logic of using it: it can only prove a computation wrong, never certify it correct, since an incorrect answer may still happen to pass the check. I asked if there were any other numbers linked to 10 which I could use in checking computations. 11 was suggested. Since 10 mod 11 is (11-1) mod 11, which is -1 mod 11, the remainder mod 11 is obtained by alternately adding and subtracting digits, starting from the units digit. Thus another check of the "authenticity" of the arithmetic computation could be made.
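In code, both checks amount to digit sums. A small sketch (the function names are mine, and the multiplication in the example is illustrative, not the one from class):

    def cast_out_nines(n):
        """n mod 9, computed by summing decimal digits (10^k = 1 mod 9)."""
        total = 0
        while n:
            total += n % 10
            n //= 10
        return total % 9

    def cast_out_elevens(n):
        """n mod 11, via the alternating digit sum from the units digit
        (since 10 = -1 mod 11, the powers 10^k alternate between +1 and -1)."""
        total, sign = 0, 1
        while n:
            total += sign * (n % 10)
            sign = -sign
            n //= 10
        return total % 11

    def plausible_product(a, b, claimed):
        """Necessary -- but not sufficient -- checks that a * b == claimed."""
        ok9 = cast_out_nines(a) * cast_out_nines(b) % 9 == cast_out_nines(claimed)
        ok11 = cast_out_elevens(a) * cast_out_elevens(b) % 11 == cast_out_elevens(claimed)
        return ok9 and ok11

    print(plausible_product(1234, 5678, 7006652))  # True: the real product
    print(plausible_product(1234, 5678, 7006651))  # False: caught by casting out 9's
    print(plausible_product(1234, 5678, 7006562))  # False: transposition, caught by 11's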

I remarked that we could now detect single-digit errors (casting out 9's catches these, except for the substitution of a 0 for a 9 or vice versa) and transpositions (interchanging adjoining non-identical digits -- detectable by casting out 11's). A more elaborate scheme might allow detection of both kinds (both common kinds!) of error at once. I gave out 4 copied pages (the ISBN section) of Codes Galore by J. Malkevich, G. Froelich, and D. Froelich. ISBN numbers (a variant of casting out 11's) are explained well there. I tried to verify the ISBN number of a book a student had.
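A sketch of that verification: the standard ISBN-10 rule weights the ten digits by 10, 9, ..., 1 and requires the weighted sum to be divisible by 11 (the example number below is simply one that passes the check):

    def isbn10_valid(isbn):
        """ISBN-10 check: 10*d1 + 9*d2 + ... + 1*d10 must be divisible by 11.
        The final digit may be 'X', which stands for the value 10."""
        digits = [10 if c in "Xx" else int(c) for c in isbn if c not in "- "]
        if len(digits) != 10:
            return False
        return sum((10 - k) * d for k, d in enumerate(digits)) % 11 == 0

    print(isbn10_valid("0-306-40615-2"))   # True: weighted sum is 132 = 12 * 11
    print(isbn10_valid("0-306-40615-3"))   # False: single-digit error is caught
    print(isbn10_valid("0-306-46015-2"))   # False: adjacent transposition is caught

Because 11 is prime and larger than any digit, this single mod-11 check catches every single-digit error and every adjacent transposition -- the "more elaborate scheme" mentioned above.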

I handed out a homework assignment [PDF|PS|TeX].

