1. Network & Information
Security Prof. Shawkat K. Guirguis 1
Network & Information Security
Shawkat K. Guirguis
Professor of Computer Science &
Informatics
2. Other sources of random
numbers:
Long Sequences from Books
Another source of supposedly "random" numbers is any
book, piece of music, or other object whose
structure can be analyzed.
A possible one-time pad is a telephone book.
The sender and the receiver both need access to
identical telephone books.
They might agree, for example, to start at page 35 and
use the two middle digits (ddd-DDdd) of each phone
number, mod 26, as a key letter for a polyalphabetic
substitution cipher using a preagreed form of Vigenère
tableau.
This approach would not provide an unlimited number of
key digits, but it might last for a year until a new
telephone book became available.
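The key derivation described above can be sketched in a few lines. This is only an illustration: the function name `key_letters`, the 7-digit local-number assumption, and the sample numbers are all hypothetical and not from the slides.

```python
import re

def key_letters(phone_numbers):
    """Map each number's two middle digits (ddd-DDdd) to a key letter via mod 26."""
    letters = []
    for number in phone_numbers:
        digits = re.sub(r"\D", "", number)   # keep digits only
        local = digits[-7:]                   # assumption: 7-digit local number
        middle = int(local[3:5])              # the "DD" in ddd-DDdd
        letters.append(chr(ord("a") + middle % 26))
    return "".join(letters)

# Hypothetical directory entries, invented for illustration only.
sample = ["555-0142", "555-3019", "555-9930"]
print(key_letters(sample))  # bev
```

Note that two decimal digits give 100 values, and 100 is not a multiple of 26, so the letters a through v each arise from four digit pairs while w through z arise from only three. The key stream is therefore slightly biased even before any structure in the directory is considered.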
Long Sequences from Books
(cont.)
A similar idea is the use of any book of prose as a key.
Then, the key is the letters of the text, in order.
For example, one might select a passage from Descartes's
meditation: "What of thinking? I am, I exist, that is
certain."
The meditation goes on for a great length, certainly long
enough to encipher many very long messages.
If you wanted to encipher the message MACHINES
CANNOT THINK you would write the message under
enough of the key, and encipher the message, again as
with a conventional polyalphabetic cipher.
iamie xistt hatis cert
MACHI NESCA NNOTT HINK
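A minimal sketch of this encipherment, assuming a standard Vigenère tableau in which each ciphertext letter is (plaintext + key) mod 26 with a = 0. The function name `vigenere_encrypt` is an illustrative choice, not something defined in the slides.

```python
def vigenere_encrypt(message, key):
    """Encipher with a standard tableau: ciphertext = (plaintext + key) mod 26."""
    plain = [c for c in message.lower() if "a" <= c <= "z"]
    stream = [c for c in key.lower() if "a" <= c <= "z"]
    if len(stream) < len(plain):
        raise ValueError("key text is shorter than the message")
    out = []
    for p, k in zip(plain, stream):
        out.append(chr((ord(p) - 97 + ord(k) - 97) % 26 + 97))
    return "".join(out)

key_text = "I am, I exist, that is certain."
cipher = vigenere_encrypt("MACHINES CANNOT THINK", key_text)
print(cipher)  # uaopmkmkvtunhbljmed
```

Spaces and punctuation are dropped from both texts before the letters are paired, which is why the key passage only needs to supply 19 letters here.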
Long Sequences from Books
(cont.)
It would seem as if this cipher, too, would be
impossible to break.
Unfortunately, that is not true.
The flaw lies in the fact that neither the
message nor the key text is evenly distributed
and, in fact, the distributions of both cluster
around high-frequency letters.
For example, the four letters A, E, O, and T
account for approximately 40 percent of all
letters used in standard English text.
Long Sequences from Books (cont.)
Each ciphertext letter is really the intersection
of a plaintext letter and a key letter.
But if the probability of the plaintext or the key
letter's being A, E, O, or T is 0.4, the
probability of both being one of the four is:
0.4 * 0.4 = 0.16, nearly 1/6.
Adding N and I to make the top six letters increases
the sum of the frequencies to 50 percent and
increases the probability for a pair to 0.25.
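The arithmetic above can be checked against a letter-frequency table. The sketch below uses approximate percentages from commonly cited English frequency tables; those values, and the assumption that plaintext and key letters are independent, are mine rather than the slides'.

```python
# Approximate relative frequencies (percent) of letters in English text.
# Assumption: values rounded from commonly cited tables, not from the slides.
FREQ = {"e": 12.7, "t": 9.1, "a": 8.2, "o": 7.5, "i": 7.0, "n": 6.7}

def pair_probability(letters):
    """Return (p, p*p): the chance one letter is in `letters`, and the chance
    that both the plaintext and the key letter are, assuming independence."""
    p = sum(FREQ[c] for c in letters) / 100
    return p, p * p

for group in ("aeot", "aeotni"):
    single, pair = pair_probability(group)
    print(f"{group}: single={single:.3f} pair={pair:.3f}")
```

With these figures the four-letter group comes out a little under 0.4 and the six-letter group a little over 0.5, so the rounded values 0.16 and 0.25 used on the slide are reasonable estimates.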
Long Sequences from Books
(cont.)
Assuming a standard Vigenere tableau has been used,
given a piece of ciphertext, we look for frequent letter
pairs that could have generated each ciphertext letter.
The encrypted version of the message
MACHINES CANNOT THINK is
uaopm kmkvt unhbl jmed
To break the cipher, assume that each letter of the
ciphertext comes from a situation in which the plaintext
letter (row selector) and the key letter (column selector)
are both one of the six most frequent letters.
(This guess will be correct approximately 25 percent of
the time.)
Long Sequences from Books (cont.)
The trick is to work the cipher inside out.
For a ciphertext letter, look in the body of the
table for the letter to appear at the intersection
of one of the six rows with one of the six
columns.
Find combinations in the Vigenere tableau that
could yield each ciphertext letter as the result
of two high-frequency letters.
The ciphertext u in this message could be in
row A, column u, but that is not a pair of
frequent letters, or it could be row B, column t,
but that is not a common pair, nor is Cs, Dr, Eq,
Fp, or any other pair.
Thus, we cannot say much about the plaintext
letter that produced u.
Long Sequences from Books (cont.)
The second letter, a, could come from
row A, column a, or from row N, column n
(N + n wraps around past z to a).
These are the only plaintext-key text
combinations of the letters A, E, O, T, N, I
that can produce an a.
If the guess holds (likelihood about 0.25),
a represents A or N.
It will help to build a reduced table of
the six frequent letter rows and
columns.
Reduced table of the six frequent
letters
    a e i n o t
A   a e i n o t
E   e i m r s x
I   i m q v w b
N   n r v a b g
O   o s w b c h
T   t x b g h m
This table is more
useful "inside out":
a could represent A or N,
h could stand for O or
T, and so on.
Working inside out
Searching through this table for possibilities, we
transform the cryptogram.
(? marks a ciphertext letter with no candidate; - is filler.)
u a o p m k m k v t u n h b l j m e d
? A A ? E ? E ? I A ? A O I ? ? E A ?
- N O - I - I - N T - N T N - - I E -
- - - - T - T - - - - - - O - - T - -
- - - - - - - - - - - - - T - - - - -
Comment
This technique does not reveal the entire message, or
even enough of it to make the message MACHI NESCA
NNOTT HINK easy to identify.
The technique did, however, make predictions in 11
letter positions, and there was a correct prediction in 8
of those 11 positions. (The correct predictions fall under
the ciphertext letters a, m, m, t, n, h, b, and the final m.)
The algorithm made 27 assertions about probable
letters, and 8 of those 27 were correct. (A score of 8 out
of 27 is about 30 percent, even better than the 25 percent
expected.)
The algorithm does not come close to solving the
cryptogram, but it reduces the 26^19 possibilities for the
analyst to consider.
Giving this much help to the cryptanalyst is significant.
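The inside-out search can also be done mechanically. The sketch below recomputes the candidates directly from the rule c = p + k mod 26 (a standard tableau, a = 0) rather than reading them from a printed table; the names `reduced_table` and `candidates` are illustrative, and the known plaintext is used only to score the guesses.

```python
from collections import defaultdict

FREQUENT = "aeinot"  # the six high-frequency letters used on the slides

def reduced_table(letters=FREQUENT):
    """Map each ciphertext letter to the plaintext letters that can produce it
    when plaintext and key are both drawn from `letters` (c = p + k mod 26)."""
    inverse = defaultdict(set)
    for p in letters:
        for k in letters:
            c = chr((ord(p) + ord(k) - 2 * 97) % 26 + 97)
            inverse[c].add(p.upper())
    return inverse

def candidates(ciphertext):
    """Return, per ciphertext letter, the sorted list of candidate plaintext letters."""
    table = reduced_table()
    return [sorted(table.get(c, set())) for c in ciphertext if c.isalpha()]

cipher = "uaopm kmkvt unhbl jmed"
plain = "MACHINESCANNOTTHINK"   # known here only so the guesses can be scored
guesses = candidates(cipher)

for c, g, p in zip(cipher.replace(" ", ""), guesses, plain):
    mark = "*" if p in g else " "   # * marks a position whose candidates include the truth
    print(c, "".join(g) or "?", mark)

hits = sum(p in g for g, p in zip(guesses, plain))
print(hits, "positions contain the true letter")
```

Because the table is built from the arithmetic, the same code works for any choice of frequent letters; widening `FREQUENT` raises the chance that the guess holds but also lengthens every candidate list.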
Dual-Message Entrapment
We can encipher two messages at once so that
an interceptor cannot distinguish between the
messages.
One message is the real message, and another
is a realistic-looking spurious message, called
the dummy.
Assume that the sender and receiver both
know the dummy message. The dummy is then
used as a key.
The cryptanalyst may deduce both key
(dummy) and plaintext messages, but nobody
can tell from the messages which is which.
Dual Message Entrapment
Consider the following two messages:
disregard this message
this message is crucial
Both have the same length.
If one serves as the key for the other, the
same ciphertext will be generated, and a
successfully decrypted message still has a
50% chance of being the wrong message.
Example on dual-message
This occurs because the encryption of letter x with
key y is the same as the encryption of letter y with
key letter x. For instance, the message and key
Key (dummy) disregardthismessage
Message THISMESSAGEISCRUCIAL
can be interchanged!
The encryption of either the key or message with the
other as the key is:
wpajqksjdzlqkovmuigp
Thus, the key cannot be distinguished from the
message.
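The symmetry is a direct consequence of addition mod 26 being commutative. A minimal sketch, again assuming a standard Vigenère tableau with a = 0; the helper names are illustrative.

```python
A = ord("a")

def _letters(text):
    """Lowercase the text and keep only the letters a-z."""
    return [c for c in text.lower() if "a" <= c <= "z"]

def vigenere_encrypt(message, key):
    return "".join(chr((ord(p) + ord(k) - 2 * A) % 26 + A)
                   for p, k in zip(_letters(message), _letters(key)))

def vigenere_decrypt(ciphertext, key):
    return "".join(chr((ord(c) - ord(k)) % 26 + A)
                   for c, k in zip(_letters(ciphertext), _letters(key)))

dummy = "disregard this message"
real = "this message is crucial"

c1 = vigenere_encrypt(real, dummy)   # real message, dummy as key
c2 = vigenere_encrypt(dummy, real)   # roles swapped
print(c1 == c2)                      # True
print(vigenere_decrypt(c1, dummy))   # thismessageiscrucial
print(vigenere_decrypt(c1, real))    # disregardthismessage
```

Decrypting with either text recovers the other, so an analyst who recovers both strings still has no cryptographic basis for deciding which one was the message.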
Summary of Substitutions
Substitutions are effective cryptographic devices.
In fact, they were the basis of many cryptographic
algorithms used for diplomatic communication through
the first half of the twentieth century.
The presentation of substitution ciphers has also
introduced several cryptanalytic tools:
1. frequency distribution
2. index of coincidence
3. consideration of highly likely letters and probable
words
4. repeated pattern analysis and the Kasiski approach
5. persistence, organization, ingenuity, and luck
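The first two tools in the list are simple to compute. The sketch below is a hedged illustration using the usual definition of the index of coincidence, the sum of n_i(n_i - 1) over all letters divided by N(N - 1); the function names are my own.

```python
from collections import Counter

def frequency_distribution(text):
    """Count how often each letter a-z occurs, ignoring case and non-letters."""
    return Counter(c for c in text.lower() if "a" <= c <= "z")

def index_of_coincidence(text):
    """Probability that two letters drawn at random from the text are equal."""
    counts = frequency_distribution(text)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two letters")
    return sum(v * (v - 1) for v in counts.values()) / (n * (n - 1))

sample = "uaopm kmkvt unhbl jmed"
print(frequency_distribution(sample).most_common(3))
print(round(index_of_coincidence(sample), 4))
```

English plaintext is usually quoted at an index near 0.066, while uniformly random letters give about 1/26, roughly 0.038. A 19-letter sample like this one is far too short for the statistic to be reliable; it is used here only to show the calculation.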