Chapter 2
Elementary Cryptography
In this chapter
Concepts of encryption
Symmetric (secret key) encryption and the DES and AES algorithms
Digital signatures
Cryptography, or secret writing, is the strongest tool for controlling against many kinds of security
threats. Well-disguised data cannot be read, modified, or fabricated easily. Cryptography is
rooted in higher mathematics: group and field theory, computational complexity, and even real
analysis, not to mention probability and statistics. Fortunately, it is not necessary to understand
the underlying mathematics to be able to use cryptography.
We begin this chapter by examining what encryption does and how it works. We introduce the
basic principles of encryption with two simple encryption methods: substitution and
transposition. Next, we explore how they can be expanded and improved to create stronger,
more sophisticated protection. Because weak or flawed encryption provides only the illusion of
protection, we also look at how encryption can fail.
We analyze techniques used to break through the protective scheme and reveal the original text.
Three very popular algorithms are in use today: DES, AES, and RSA. We look at them in some
detail to see how these and other algorithms can be used as building blocks with protocols and
structures to perform other computing tasks, such as signing documents, detecting modification,
and exchanging sensitive data.
Chapter 12 offers a deeper analysis of encryption techniques and algorithms, including their
mathematical bases, the mechanisms that make them work, and their limitations. Most users of
cryptography will never invent their own algorithms, just as most users of electricity do not build
their own power generators. Still, deeper knowledge of how cryptography works can help you
use it effectively, just as deeper knowledge of energy issues helps you understand the
environmental and cost trade-offs among different energy sources. This chapter offers you a
rudimentary understanding of what cryptography is, but we encourage you to study the details in
Chapter 12 to better understand the implications of each choice of cryptographic algorithm.
Block it, by preventing its reaching R, thereby affecting the availability of the message.
Intercept it, by reading or listening to the message, thereby affecting the confidentiality
of the message.
Modify it, by seizing the message and changing it in some way, affecting the message's
integrity.
Fabricate an authentic-looking message, arranging for it to be delivered as if it came
from S, thereby also affecting the integrity of the message.
As you can see, a message's vulnerabilities reflect the four possible security failures we
identified in Chapter 1. Fortunately, encryption is a technique that can address all these
problems. Encryption, probably the most fundamental building block of secure computing, is a
means of maintaining secure data in an insecure environment. (It is not the only building block,
however.)
In this book, we study encryption as a security technique, and we see how it is used in protecting
programs, databases, networks, and electronic communications.
Terminology
Encryption is the process of encoding a message so that its meaning is not obvious; decryption
is the reverse process, transforming an encrypted message back into its normal, original form.
Alternatively, the terms encode and decode or encipher and decipher are used instead of
encrypt and decrypt.[1] That is, we say that we encode, encrypt, or encipher the original message
to hide its meaning. Then, we decode, decrypt, or decipher it to reveal the original message. A
system for encryption and decryption is called a cryptosystem.
[1]
There are slight differences in the meanings of these three pairs of words,
although they are not significant in this context. Strictly speaking, encoding is the
process of translating entire words or phrases to other words or phrases, whereas
enciphering is translating letters or symbols individually; encryption is the group
term that covers both encoding and enciphering.
The original form of a message is known as plaintext, and the encrypted form is called
ciphertext. This relationship is shown in Figure 2-1. For convenience, we denote a plaintext
message P as a sequence of individual characters P = <p1, p2, …, pn>. Similarly, ciphertext is
written as C = <c1, c2, …, cm>. For instance, the plaintext message "I want cookies" can be
denoted as the message string <I, ,w,a,n,t, ,c,o,o,k,i,e,s>. It can be transformed into ciphertext
<c1, c2, …, c14>, and the encryption algorithm tells us how the transformation is done.
We use this formal notation to describe the transformations between plaintext and ciphertext. For
example, we write C = E(P) and P = D(C), where C represents the ciphertext, E is the encryption
rule, P is the plaintext, and D is the decryption rule. What we seek is a cryptosystem for which P
= D(E(P)). In other words, we want to be able to convert the message to protect it from an
intruder, but we also want to be able to get the original message back so that the receiver can
read it properly.
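The requirement P = D(E(P)) can be illustrated with a tiny sketch. For the moment the encryption rule E is an arbitrary three-letter shift (this particular rule appears later in the chapter as the Caesar cipher); the point is only that D undoes E:

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def E(plaintext):
    """An illustrative encryption rule: replace each letter with the one
    three places later in the alphabet, wrapping around at Z."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + 3) % 26] for ch in plaintext)

def D(ciphertext):
    """The matching decryption rule: shift each letter back three places."""
    return "".join(ALPHABET[(ALPHABET.index(ch) - 3) % 26] for ch in ciphertext)

P = "TREATY"
C = E(P)            # C = E(P)
assert D(C) == P    # the property we require: P = D(E(P))
```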
Encryption Algorithms
The cryptosystem involves a set of rules for how to encrypt the plaintext and how to decrypt the
ciphertext. The encryption and decryption rules, called algorithms, often use a device called a
key, denoted by K, so that the resulting ciphertext depends on the original plaintext message, the
algorithm, and the key value. We write this dependence as C = E(K, P). Essentially, E is a set of
encryption algorithms, and the key K selects one specific algorithm from the set. We see later in
this chapter that some cryptosystems, such as the Caesar cipher, are keyless but that keyed
encryptions are more difficult to break.
Sometimes the encryption and decryption keys are the same, so P = D(K, E(K,P)). This form is
called symmetric encryption because D and E are mirror-image processes. At other times,
encryption and decryption keys come in pairs. Then, a decryption key, KD, inverts the encryption
of key KE so that P = D(KD, E(KE,P)). Encryption algorithms of this form are called asymmetric
because converting C back to P involves a series of steps and a key that are different from the
steps and key of E. The difference between symmetric and asymmetric encryption is shown in
Figure 2-2.
A key gives us flexibility in using an encryption scheme. We can create different encryptions of
one plaintext message just by changing the key. Moreover, using a key provides additional
security. If the encryption algorithm should fall into the interceptor's hands, future messages can
still be kept secret because the interceptor will not know the key value. Sidebar 2-1 describes
how the British dealt with written keys and codes in World War II. An encryption scheme that
does not require the use of a key is called a keyless cipher.
A cryptanalyst studies encryption and encrypted messages, hoping to find the hidden meanings.
Both a cryptographer and a cryptanalyst attempt to translate coded material back to its original
form. Normally, a cryptographer works on behalf of a legitimate sender or receiver, whereas a
cryptanalyst works on behalf of an unauthorized interceptor. Finally, cryptology is the research
into and study of encryption and decryption; it includes both cryptography and cryptanalysis.
Cryptanalysis
A cryptanalyst's chore is to break an encryption. That is, the cryptanalyst attempts to deduce the
original meaning of a ciphertext message. Better yet, he or she hopes to determine which
decrypting algorithm matches the encrypting algorithm so that other messages encoded in the
same way can be broken. For instance, suppose two countries are at war and the first country has
intercepted encrypted messages of the second. Cryptanalysts of the first country want to decipher
a particular message so that the first country can anticipate the movements and resources of the
second. But it is even better to discover the actual decryption algorithm; then the first country
can easily break the encryption of all messages sent by the second country.
An analyst works with a variety of pieces of information: encrypted messages, known encryption
algorithms, intercepted plaintext, data items known or suspected to be in a ciphertext message,
mathematical or statistical tools and techniques, properties of languages, computers, and plenty
of ingenuity and luck. Each piece of evidence can provide a clue, and the analyst puts the clues
together to try to form a larger picture of a message's meaning in the context of how the
encryption is done. Remember that there are no rules. An interceptor can use any means
available to tease out the meaning of the message.
Breakable Encryption
An encryption algorithm is called breakable when, given enough time and data, an analyst can
determine the algorithm. However, an algorithm that is theoretically breakable may in fact be
impractical to try to break. To see why, consider a 25-character message that is expressed in just
uppercase letters. A given cipher scheme may have 26^25 (approximately 10^35) possible
decipherments, so the task is to select the right one out of the 26^25. If your computer could
perform on the order of 10^10 operations per second, finding this decipherment would require on
the order of 10^25 seconds, or roughly 10^17 years. In this case, although we know that theoretically
we could generate the solution, determining the deciphering algorithm by examining all
possibilities can be ignored as infeasible with current technology.
Two other important issues must be addressed when considering the breakability of encryption
algorithms. First, the cryptanalyst cannot be expected to try only the hard, long way. In the
example just presented, the obvious decryption might require 26^25 machine operations, but a
more ingenious approach might require only 10^15 operations. At the speed of 10^10 operations per
second, 10^15 operations take slightly more than one day. The ingenious approach is certainly
feasible. As we see later in this chapter, some of the algorithms we study in this book are based
on known "hard" problems that take an unreasonably long time to solve. But the cryptanalyst
does not necessarily have to solve the underlying problem to break the encryption of a single
message. As we note in Sidebar 2-2, sloppy use of controls can reveal likely words or phrases,
and an analyst can use educated guesses combined with careful analysis to generate all or most
of an important message.
Representing Characters
We want to study ways of encrypting any computer material, whether it is written as ASCII
characters, binary data, object code, or a control stream. However, to simplify the explanations,
we begin with the encryption of messages written in the standard 26-letter English[2] alphabet, A
through Z.
[2]
Because this book is written in English, the explanations refer to English. However, with slight
variations, the techniques are applicable to most other written languages as well.
Throughout the book, we use the convention that plaintext is written in UPPERCASE letters, and
ciphertext is in lowercase letters. Because most encryption algorithms are based on mathematical
transformations, they can be explained or studied more easily in mathematical form. Therefore,
in this book, we switch back and forth between letters and the numeric encoding of each letter:
A = 0, B = 1, ..., Z = 25.
Thus, the letter A is represented by a zero, B by a one, and so on. This representation allows us
to consider performing arithmetic on the "letters" of a message. That is, we can perform addition
and subtraction on letters by adding and subtracting the corresponding code numbers.
Expressions such as A + 3 = D or K - 1 = J have their natural interpretation. Arithmetic is
performed as if the alphabetic table were circular.[3] In other words, addition wraps around from
one end of the table to the other so that Y + 3 = B. Thus, every result of an arithmetic operation
is between 0 and 25.
[3]
This form of arithmetic is called modular arithmetic, written mod n, which means that any
result greater than or equal to n is reduced by n as many times as necessary to bring it back into
the range 0 <= result < n. Another way to reduce a result is to use the remainder after dividing the
number by n. For example, the value of 95 mod 26 is the remainder of 95/26, which is 17, while
95 - 26 - 26 - 26 = 17; alternatively, starting at position 0 (A) and counting ahead 95 positions
(and returning to position 0 each time after passing position 25) also brings us to position 17.
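The circular letter arithmetic described above can be sketched in a few lines of Python, assuming the A = 0 through Z = 25 encoding:

```python
def num(letter):
    """Encode A = 0, B = 1, ..., Z = 25."""
    return ord(letter) - ord('A')

def letter(n):
    """Decode a number back to a letter; mod 26 makes the table circular."""
    return chr(ord('A') + n % 26)

assert letter(num('A') + 3) == 'D'   # A + 3 = D
assert letter(num('K') - 1) == 'J'   # K - 1 = J
assert letter(num('Y') + 3) == 'B'   # wraparound: Y + 3 = B
assert 95 % 26 == 17                 # the footnote's example: 95 mod 26 = 17
```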
There are many types of encryption. In the next two sections we look at two simple forms of
encryption: substitutions, in which one letter is exchanged for another, and transpositions, in
which the order of the letters is rearranged. The goals of studying these two forms are to become
familiar with the concept of encryption and decryption, to learn some of the terminology and
methods of cryptanalysis, and to study some of the weaknesses to which encryption is prone.
Once we have mastered the simple encryption algorithms, we explore "commercial grade"
algorithms used in modern computer applications.
2.2. Substitution Ciphers
Children sometimes devise "secret codes" that use a correspondence table with which to
substitute a character or symbol for each character of the original message. This technique is
called a monoalphabetic cipher or simple substitution. A substitution is an acceptable way of
encrypting text. In this section, we study several kinds of substitution ciphers.
The Caesar cipher has an important place in history. Julius Caesar is said to have been the first
to use this scheme, in which each letter is translated to the letter a fixed number of places after it
in the alphabet. Caesar used a shift of 3, so plaintext letter pi was enciphered as ciphertext letter
ci by the rule
ci = E(pi) = pi + 3 mod 26
Using this encryption, the message
TREATY IMPOSSIBLE
would be encoded as
T R E A T Y I M P O S S I B L E
w u h d w b l p s r v v l e o h
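A minimal sketch of the Caesar rule, following the book's convention of UPPERCASE plaintext and lowercase ciphertext (letting spaces pass through unchanged is an assumption made for this illustration):

```python
def caesar_encrypt(plaintext, shift=3):
    """Apply c_i = p_i + shift mod 26 to each letter.

    Plaintext is taken in UPPERCASE; ciphertext is produced in lowercase,
    following the book's convention. Non-letters pass through unchanged.
    """
    out = []
    for ch in plaintext.upper():
        if ch.isalpha():
            out.append(chr(ord('a') + (ord(ch) - ord('A') + shift) % 26))
        else:
            out.append(ch)
    return "".join(out)

assert caesar_encrypt("TREATY IMPOSSIBLE") == "wuhdwb lpsrvvleoh"
```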
Advantages and Disadvantages of the Caesar Cipher
Most ciphers, and especially the early ones, had to be easy to perform in the field. In particular, it
was dangerous to have the cryptosystem algorithms written down for the soldiers or spies to
follow. Any cipher that was so complicated that its algorithm had to be written out was at risk of
being revealed if the interceptor caught a sender with the written instructions. Then, the
interceptor could readily decode any ciphertext messages intercepted (until the encryption
algorithm could be changed).
The Caesar cipher is quite simple. During Caesar's lifetime, the simplicity did not dramatically
compromise the safety of the encryption because anything written, even in plaintext, was rather
well protected; few people knew how to read! The pattern pi + 3 was easy to memorize and
implement. A sender in the field could write out a plaintext and a ciphertext alphabet, encode a
message to be sent, and then destroy the paper containing the alphabets. Sidebar 2-3 describes
actual use of a cipher similar to the Caesar cipher.
Its obvious pattern is also the major weakness of the Caesar cipher. A secure encryption should
not allow an interceptor to use a small piece of the ciphertext to predict the entire pattern of the
encryption.
Let us take a closer look at the result of applying Caesar's encryption technique to "TREATY
IMPOSSIBLE." If we did not know the plaintext and were trying to guess it, we would have
many clues from the ciphertext. For example, the break between the two words is preserved in
the ciphertext, and double letters are preserved: The SS is translated to vv. We might also notice
that when a letter is repeated, it maps again to the same ciphertext as it did previously. So the
letters T, I, and E always translate to w, l, and h. These clues make this cipher easy to break.
Suppose you are given the following ciphertext message, and you want to try to determine the
original plaintext.
wklv phvvdjh lv qrw wrr kdug wr euhdn
The message has actually been enciphered with a 27-symbol alphabet: A through Z plus the
"blank" character or separator between words.[4] As a start, assume that the coder was lazy and
has allowed the blank to be translated to itself. If your assumption is true, it is an exceptional
piece of information; knowing where the spaces are allows us to see which are the small words.
English has relatively few small words, such as am, is, to, be, he, we, and, are, you, she, and so
on. Therefore, one way to attack this problem and break the encryption is to substitute known
short words at appropriate places in the ciphertext until you have something that seems to be
meaningful. Once the small words fall into place, you can try substituting for matching
characters at other places in the ciphertext.
[4]
In fact, in most encryption schemes, spaces between words often are deleted, under the
assumption that a legitimate receiver can break most messages into words fairly easily. For ease
of writing and decoding, messages are then arbitrarily broken into blocks of a uniform size, such
as every five characters, so that there is no significance to the places where the message is
broken.
Look again at the ciphertext you are decrypting. There is a strong clue in the repeated r of the
word wrr. You might use this text to guess at three-letter words that you know. For instance, two
very common three-letter words having the pattern xyy are see and too; other less common
possibilities are add, odd, and off. (Of course, there are also obscure possibilities like woo or gee,
but it makes more sense to try the common cases first.) Moreover, the combination wr appears in
the ciphertext, too, so you can determine whether the first two letters of the three-letter word also
form a two-letter word.
For instance, if wrr is SEE, wr would have to be SE, which is unlikely. However, if wrr is TOO,
wr would be TO, which is quite reasonable. Substituting T for w and O for r, the message becomes
wklv phvvdjh lv qrw wrr kdug wr euhdn
T--- ------- -- -OT TOO ---- TO -----
The -OT could be cot, dot, got, hot, lot, not, pot, rot, or tot; a likely choice is not. Unfortunately, q
= N does not give any more clues because q appears only once in this sample.
The word lv is also the end of the word wklv, which probably starts with T. Likely two-letter
words that can also end a longer word include so, is, in, etc. However, so is unlikely because the
form T-SO is not recognizable; IN is ruled out because of the previous assumption that q is N. A
more promising alternative is to substitute IS for lv throughout, and continue to analyze the
message in that way.
By now, you might notice that the ciphertext letters uncovered are just three positions away from
their plaintext counterparts. You (and any experienced cryptanalyst) might try that same pattern
on all the unmatched ciphertext. The completion of this decryption is left as an exercise.
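The shift pattern can also be confirmed mechanically: because there are only 25 possible nontrivial shifts, an analyst can simply try them all and look for readable text. A sketch:

```python
def shift_decrypt(ciphertext, shift):
    """Undo a simple alphabetic shift; spaces pass through unchanged."""
    out = []
    for ch in ciphertext:
        if ch.isalpha():
            out.append(chr(ord('A') + (ord(ch.lower()) - ord('a') - shift) % 26))
        else:
            out.append(ch)
    return "".join(out)

cipher = "wklv phvvdjh lv qrw wrr kdug wr euhdn"
# Try all 25 shifts; only one produces sensible English.
candidates = [shift_decrypt(cipher, s) for s in range(1, 26)]
assert "THIS MESSAGE IS NOT TOO HARD TO BREAK" in candidates
```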
The cryptanalysis described here is ad hoc, using deduction based on guesses instead of solid
principles. But you can take a more methodical approach, considering which letters commonly
start words, which letters commonly end words, and which prefixes and suffixes are common.
Cryptanalysts have compiled lists of common prefixes, common suffixes, and words having
particular patterns. (For example, sleeps is a word that follows the pattern abccda.) In the next
section, we look at a different analysis technique.
Other Substitutions
In substitutions, the alphabet is scrambled, and each plaintext letter maps to a unique ciphertext
letter. We can describe this technique in a more mathematical way. Formally, we say that a
permutation is a reordering of the elements of a sequence. For instance, we can permute the
numbers 1 to 10 in many ways, including the permutations π1 = 1, 3, 5, 7, 9, 10, 8, 6, 4, 2; and π2
= 10, 9, 8, 7, 6, 5, 4, 3, 2, 1. A permutation is a function, so we can write expressions such as
π1(3) = 5 meaning that the letter in position 3 is to be replaced by the fifth letter. If the set is the
first ten letters of the alphabet, π1(3) = 5 means that plaintext C is transformed into ciphertext e.
One way to scramble an alphabet is to use a key, a word that controls the permutation. For
instance, if the key is word, the sender or receiver first writes the alphabet and then writes the
key under the first few letters of the alphabet.
ABCDEFGHIJKLMNOPQRSTUVWXYZ
word
The sender or receiver then fills in the remaining letters of the alphabet, in some easy-to-
remember order, after the keyword.
ABCDEFGHIJKLMNOPQRSTUVWXYZ
wordabcefghijklmnpqstuvxyz
In this example, the key is short, so most plaintext letters are only one or two positions off from
their ciphertext equivalents. With a longer keyword, the distance is greater and less predictable,
as shown below. Because π must map one plaintext letter to exactly one ciphertext letter,
duplicate letters in a keyword, such as the second s and o in professional, are dropped.
ABCDEFGHIJKLMNOPQRSTUVWXYZ
profesinalbcdghjkmqtuvwxyz
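The keyword construction above can be sketched directly: duplicate letters are dropped as they are encountered, then the unused letters follow in order.

```python
def keyed_alphabet(keyword):
    """Write the keyword (duplicates dropped), then the remaining letters
    of the alphabet in their normal order."""
    seen = []
    for ch in keyword + "abcdefghijklmnopqrstuvwxyz":
        if ch not in seen:
            seen.append(ch)
    return "".join(seen)

assert keyed_alphabet("word") == "wordabcefghijklmnpqstuvxyz"
assert keyed_alphabet("professional") == "profesinalbcdghjkmqtuvwxyz"
```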
Notice that near the end of the alphabet replacements are rather close, and the last seven
characters map to themselves. Conveniently, the last characters of the alphabet are among the
least frequently used, so this vulnerability would give little help to an interceptor.
Still, since regularity helps an interceptor, it is desirable to have a less regular rearrangement of
the letters. One possibility is to count by threes (or fives or sevens or nines) and rearrange the
letters in that order. For example, one encryption uses a table that starts with
ABCDEFGHIJKLMNOPQRSTUVWXYZ
adgj
using every third letter. At the end of the alphabet, the pattern continues mod 26, as shown
below.
ABCDEFGHIJKLMNOPQRSTUVWXYZ
adgjmpsvybehknqtwzcfilorux
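The count-by-threes table can be generated by stepping through the alphabet mod 26. (The choices 3, 5, 7, and 9 all work because each is relatively prime to 26, so every letter is reached exactly once.)

```python
step = 3   # counting by threes; 5, 7, or 9 would also work
table = "".join(chr(ord('a') + (i * step) % 26) for i in range(26))
assert table == "adgjmpsvybehknqtwzcfilorux"
```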
There are many other examples of substitution ciphers. For instance, Sidebar 2-4 describes a
substitution cipher called a poem code, used in the early days of World War II by British spies to
keep the Germans from reading their messages.
An important issue in using any cryptosystem is the time it takes to turn plaintext into ciphertext,
and vice versa. Especially in the field (when encryption is used by spies or decryption is
attempted by soldiers), it is essential that the scrambling and unscrambling not deter the
authorized parties from completing their missions. The timing is directly related to the
complexity of the encryption algorithm. For example, encryption and decryption with
substitution ciphers can be performed by direct lookup in a table illustrating the correspondence,
like the ones shown in our examples. Transforming a single character can be done in a constant
amount of time, so we express the complexity of the algorithm by saying that the time to encrypt
a message of n characters is proportional to n. One way of thinking of this expression is that if
one message is twice as long as another, it will take twice as long to encrypt.
The techniques described for breaking the Caesar cipher can also be used on other substitution
ciphers. Short words, words with repeated patterns, and common initial and final letters all give
clues for guessing the permutation.
Of course, breaking the code is a lot like working a crossword puzzle: You try a guess and
continue to work to substantiate that guess until you have all the words in place or until you
reach a contradiction. For a long message, this process can be extremely tedious. Fortunately,
there are other approaches to breaking an encryption. In fact, analysts apply every technique at
their disposal, using a combination of guess, strategy, and mathematical skill.
Cryptanalysts may attempt to decipher a particular message at hand, or they may try to determine
the encryption algorithm that generated the ciphertext in the first place (so that future messages
can be broken easily). One approach is to try to reverse the difficulty introduced by the
encryption.
To see why, consider the difficulty of breaking a substitution cipher. At face value, such
encryption techniques seem secure because there are 26! possible different encipherments. We
know this because we have 26 choices of letter to substitute for the a, then 25 (all but the one
chosen for a) for b, 24 (all but the ones chosen for a and b) for c, and so on, to yield 26 * 25 * 24
* … * 2 * 1 = 26! possibilities. By using a brute force attack, the cryptanalyst could try all 26!
permutations of a particular ciphertext message. Working at one permutation per microsecond
(assuming the cryptanalyst had the patience to review the probable-looking plaintexts produced
by some of the permutations), it would still take more than ten trillion years to test all 26! possibilities.
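The size of this search space is easy to check; the one-permutation-per-microsecond rate is the assumption from the text:

```python
import math

permutations = math.factorial(26)       # 26!, about 4 * 10**26
seconds = permutations / 10**6          # one permutation per microsecond
years = seconds / (365 * 24 * 60 * 60)
assert years > 10**13                   # more than ten trillion years
```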
We can use our knowledge of language to simplify this problem. For example, in English, some
letters are used more often than others. The letters E, T, O, and A occur far more often than J, Q,
X, and Z, for example. Thus, the frequency with which certain letters are used can help us to
break the code more quickly. We can also recognize that the nature and context of the text being
analyzed affect the distribution. For instance, in a medical article in which the term x-ray was
used often, the letter x would have an uncommonly high frequency.
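The counting step of frequency analysis is straightforward; here it is sketched on the short Caesar-cipher example from earlier in the chapter:

```python
from collections import Counter

def letter_frequencies(text):
    """Count each letter's occurrences, ignoring spaces and case."""
    return Counter(ch for ch in text.lower() if ch.isalpha())

freq = letter_frequencies("wklv phvvdjh lv qrw wrr kdug wr euhdn")
# w, r, and v each occur four times; their plaintexts (T, O, S) are
# among the most common letters of English.
assert freq['w'] == freq['r'] == freq['v'] == 4
```

On so short a sample the counts only hint at the plaintext; with a longer ciphertext, the observed distribution aligns far more reliably with the known English letter frequencies.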
When messages are long enough, the frequency distribution analysis quickly betrays many of the
letters of the plaintext. In this and other ways, a good cryptanalyst finds approaches for
bypassing hard problems. An encryption based on a hard problem is not secure just because of
the difficulty of the problem.
How difficult is it to break substitutions? With a little help from frequency distributions and
letter patterns, you can probably break a substitution cipher by hand. It follows that, with the aid
of computer programs and with an adequate amount of ciphertext, a good cryptanalyst can break
such a cipher in an hour. Even an untrained but diligent interceptor could probably determine the
plaintext in a day or so. Nevertheless, in some applications, the prospect of one day's effort, or
even the appearance of a sheet full of text that makes no sense, may be enough to protect the
message. Encryption, even in a simple form, will deter the casual observer.
As with many analysis techniques, having very little ciphertext inhibits the effectiveness of a
technique being used to break an encryption. A cryptanalyst works by finding patterns. Short
messages give the cryptanalyst little to work with, so short messages are fairly secure with even
simple encryption.
Substitutions highlight the cryptologist's dilemma: An encryption algorithm must be regular for
it to be algorithmic and for cryptographers to be able to remember it. Unfortunately, the
regularity gives clues to the cryptanalyst.
There is no solution to this dilemma. In fact, cryptography and cryptanalysis at times seem to go
together like a dog chasing its tail. First, the cryptographer invents a new encryption algorithm to
protect a message. Then, the cryptanalyst studies the algorithm, finding its patterns and
weaknesses. The cryptographer then sets out to try to secure messages by inventing a new
algorithm, and then the cryptanalyst has a go at it. It is here that the principle of timeliness from
Chapter 1 applies; a security measure must be strong enough to keep out the attacker only for the
life of the data. Data with a short time value can be protected with simple measures.
One-Time Pads
A one-time pad is sometimes considered the perfect cipher. The name comes from an
encryption method in which a large, nonrepeating set of keys is written on sheets of paper, glued
together into a pad. For example, if the keys are 20 characters long and a sender must transmit a
message 300 characters in length, the sender would tear off the next 15 pages of keys. The
sender would write the keys one at a time above the letters of the plaintext and encipher the
plaintext with a prearranged chart (called a Vigenère tableau) that has all 26 letters in each
column, in some scrambled order. The sender would then destroy the used keys.
For the encryption to work, the receiver needs a pad identical to that of the sender. Upon
receiving a message, the receiver takes the appropriate number of keys and deciphers the
message as if it were a plain substitution with a long key. Essentially, this algorithm gives the
effect of a key as long as the number of characters in the pad.
The one-time pad method has two problems: the need for absolute synchronization between
sender and receiver, and the need for an unlimited number of keys. Although generating a large
number of random keys is no problem, printing, distributing, storing, and accounting for such
keys are problems.
A close approximation of a one-time pad for use on computers is a random number generator. In
fact, computer random numbers are not random; they really form a sequence with a very long
period (that is, they go for a long time before repeating the sequence). In practice, a generator
with a long period can be acceptable for a limited amount of time or plaintext.
To use a random number generator, the sender with a 300-character message would interrogate
the computer for the next 300 random numbers, scale them to lie between 0 and 25, and use one
number to encipher each character of the plaintext message.
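A sketch of this scheme, assuming both parties share the generator's seed so that their key streams stay synchronized. (Python's random module is a pseudorandom generator of the long-period kind just described, not a cryptographically secure one; it serves here only to illustrate the idea.)

```python
import random

def keystream(seed, length):
    """Draw `length` pseudorandom key values, each scaled to 0..25."""
    rng = random.Random(seed)
    return [rng.randrange(26) for _ in range(length)]

def encipher(plaintext, seed):
    """Add one key value mod 26 to each UPPERCASE plaintext letter."""
    keys = keystream(seed, len(plaintext))
    return "".join(chr(ord('a') + (ord(p) - ord('A') + k) % 26)
                   for p, k in zip(plaintext, keys))

def decipher(ciphertext, seed):
    """Subtract the same key values to recover the plaintext."""
    keys = keystream(seed, len(ciphertext))
    return "".join(chr(ord('A') + (ord(c) - ord('a') - k) % 26)
                   for c, k in zip(ciphertext, keys))

assert decipher(encipher("TREATY", 42), 42) == "TREATY"
```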
The Vernam cipher is a type of one-time pad devised by Gilbert Vernam for AT&T. The
Vernam cipher is immune to most cryptanalytic attacks. The basic encryption involves an
arbitrarily long nonrepeating sequence of numbers that are combined with the plaintext.
Vernam's invention used an arbitrarily long punched paper tape that fed into a teletype machine.
The tape contained random numbers that were combined with characters typed into the teletype.
The sequence of random numbers had no repeats, and each tape was used only once. As long as
the key tape does not repeat or is not reused, this type of cipher is immune to cryptanalytic attack
because the available ciphertext does not display the pattern of the key. A model of this process
is shown in Figure 2-3.
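Vernam's machine combined key values with message characters; on a modern computer the natural combining operation is XOR (exclusive or), which is its own inverse. A sketch under that assumption, using a random key as long as the message and used only once:

```python
import secrets

def vernam(data: bytes, key: bytes) -> bytes:
    """XOR each message byte with the corresponding key byte.
    Applying the same key a second time recovers the original."""
    assert len(key) == len(data)   # the key must be as long as the message
    return bytes(d ^ k for d, k in zip(data, key))

message = b"TREATY IMPOSSIBLE"
key = secrets.token_bytes(len(message))    # nonrepeating, single-use key
ciphertext = vernam(message, key)
assert vernam(ciphertext, key) == message  # XOR twice restores the plaintext
```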
Summary of Substitutions
Substitutions are effective cryptographic devices. In fact, they were the basis of many
cryptographic algorithms used for diplomatic communication through the first half of the
twentieth century. Because they are interesting and intriguing, they show up in mysteries by
Arthur Conan Doyle, Edgar Allan Poe, Agatha Christie, Ken Follett, and others.
But substitution is not the only kind of encryption technique. In the next section, we introduce
the other basic cryptographic invention: the transposition (permutation). Substitutions and
permutations together form a basis for some widely used commercial-grade encryption
algorithms that we discuss later in this chapter.
2.3. Transpositions (Permutations)
The goal of substitution is confusion; the encryption method is an attempt to make it difficult for
a cryptanalyst or intruder to determine how a message and key were transformed into ciphertext.
In this section, we look at a different kind of scrambling with the similar goal. A transposition is
an encryption in which the letters of the message are rearranged. With transposition, the
cryptography aims for diffusion, widely spreading the information from the message or the key
across the ciphertext. Transpositions try to break established patterns. Because a transposition is
a rearrangement of the symbols of a message, it is also known as a permutation.
Columnar Transpositions
As with substitutions, we begin this study of transpositions by examining a simple example. The
columnar transposition is a rearrangement of the characters of the plaintext into columns.
The following set of characters is a five-column transposition. Suppose you want to encrypt the
plaintext message THIS IS A MESSAGE TO SHOW HOW A COLUMNAR TRANSPOSITION
WORKS. The plaintext characters are written in rows of five, one row after another, as shown
here.
T H I S I
S A M E S
S A G E T
O S H O W
H O W A C
O L U M N
A R T R A
N S P O S
I T I O N
W O R K S
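The ciphertext is then read off column by column. The scheme can be sketched in a few lines of Python (the function name columnar_encrypt is ours, chosen for illustration):

```python
def columnar_encrypt(plaintext, columns=5):
    # Strip spaces and normalize case, as in the example above.
    text = plaintext.replace(" ", "").upper()
    # The grid is rows of `columns` characters; reading every
    # `columns`-th character starting at position `col` reads one column.
    return "".join(text[col::columns] for col in range(columns))

msg = "THIS IS A MESSAGE TO SHOW HOW A COLUMNAR TRANSPOSITION WORKS"
cipher = columnar_encrypt(msg)
```

Reading down the columns of the grid above, the first column yields TSSOHOANIW, the second HAASOLRSTO, and so on.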
Recall that the two basic kinds of encryptions are symmetric (also called "secret key") and
asymmetric (also called "public key").
Symmetric algorithms use one key, which works for both encryption and decryption. Usually,
the decryption algorithm is closely related to the encryption one. (For example, the Caesar cipher
with a shift of 3 uses the encryption algorithm "substitute the character three letters later in the
alphabet" with the decryption "substitute the character three letters earlier in the alphabet.")
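The Caesar cipher just described can be sketched directly; the same function serves for both directions, with the decryption key being the negative of the encryption key:

```python
import string

def caesar(text, shift):
    # Substitute each letter with the one `shift` positions later
    # in the alphabet (wrapping around at Z).
    alphabet = string.ascii_uppercase
    table = str.maketrans(alphabet, alphabet[shift:] + alphabet[:shift])
    return text.upper().translate(table)

cipher = caesar("TREATY", 3)    # encrypt: shift three letters later -> "WUHDWB"
plain = caesar(cipher, -3)      # decrypt: shift three letters earlier
```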
The symmetric systems provide a two-way channel to their users: A and B share a secret key,
and they can both encrypt information to send to the other as well as decrypt information from
the other. As long as the key remains secret, the system also provides authentication proof that a
message received was not fabricated by someone other than the declared sender. Authenticity is
ensured because only the legitimate sender can produce a message that will decrypt properly
with the shared key.
The symmetry of this situation is a major advantage of this type of encryption, but it also leads to
a problem: key distribution. How do A and B obtain their shared secret key? And only A and B
can use that key for their encrypted communications. If A wants to share encrypted
communication with another user C, A and C need a different shared key. Key distribution is the
major difficulty in using symmetric encryption. In general, n users who want to communicate in
pairs need n * (n - 1)/2 keys. In other words, the number of keys needed increases at a rate
proportional to the square of the number of users! So a property of symmetric encryption
systems is that they require a means of key distribution.
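The quadratic growth in the number of keys is easy to see with a small illustrative computation:

```python
def keys_needed(n):
    # Each distinct pair of users shares one secret key.
    return n * (n - 1) // 2

# 2 users need 1 key, 10 need 45, 100 need 4,950, 1,000 need 499,500.
for n in (2, 10, 100, 1000):
    print(n, "users need", keys_needed(n), "keys")
```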
Public key systems, on the other hand, excel at key management. By the nature of the public key
approach, you can send a public key in an e-mail message or post it in a public directory. Only
the corresponding private key, which presumably is kept private, can decrypt what has been
encrypted with the public key.
But for both kinds of encryption, a key must be kept well secured. Once the symmetric or private
key is known by an outsider, all messages written previously or in the future can be decrypted
(and hence read or modified) by the outsider. So, for all encryption algorithms, key
management is a major issue. It involves storing, safeguarding, and activating keys.
Most of the ciphers we have presented so far are stream ciphers; that is, they convert one
symbol of plaintext immediately into a symbol of ciphertext. (The exception is the columnar
transposition cipher.) The transformation depends only on the symbol, the key, and the control
information of the encipherment algorithm. A model of stream enciphering is shown in Figure 2-6.
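One symbol-at-a-time model of stream enciphering can be sketched as a repeating-key XOR (a simplification: a real stream cipher generates its keystream from the key rather than repeating the key itself):

```python
from itertools import cycle

def xor_stream(data: bytes, key: bytes) -> bytes:
    # Each plaintext byte is combined with the next keystream byte
    # as it arrives; XOR makes the same function encrypt and decrypt.
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

ciphertext = xor_stream(b"ATTACK AT DAWN", b"KEY")
recovered = xor_stream(ciphertext, b"KEY")   # applying it again decrypts
```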
Some kinds of errors, such as skipping a character in the key during encryption, affect the
encryption of all future characters. However, such errors can sometimes be recognized during
decryption because the plaintext will be properly recovered up to a point, and then all following
characters will be wrong. If that is the case, the receiver may be able to recover from the error by
dropping a character of the key on the receiving end. Once the receiver has successfully
recalibrated the key with the ciphertext, there will be no further effects from this error.
To address this problem and make it harder for a cryptanalyst to break the code, we can use
block ciphers. A block cipher encrypts a group of plaintext symbols as one block. The columnar
transposition and other transpositions are examples of block ciphers. In the columnar
transposition, the entire message is translated as one block. The block size need not have any
particular relationship to the size of a character. Block ciphers work on blocks of plaintext and
produce blocks of ciphertext, as shown in Figure 2-7. In the figure, the central box represents an
encryption machine: The previous plaintext pair is converted to po, the current one being
converted is IH, and the machine is soon to convert ES.
The Data Encryption Standard (DES) [NBS77], a system developed for the U.S. government,
was intended for use by the general public. It has been officially accepted as a cryptographic
standard both in the United States and abroad. Moreover, many hardware and software systems
have been designed with the DES. However, recently its adequacy has been questioned.
In the early 1970s, the U.S. National Bureau of Standards (NBS) recognized that the general
public needed a secure encryption technique for protecting sensitive information. Historically,
the U.S. Department of Defense and the Department of State had had continuing interest in
encryption systems; it was thought that these departments were home to the greatest expertise in
cryptology. However, precisely because of the sensitive nature of the information they were
encrypting, the departments could not release any of their work. Thus, the responsibility for a
more public encryption technique was delegated to the NBS.
At the same time, several private vendors had developed encryption devices, using either
mechanical means or programs that individuals or firms could buy to protect their sensitive
communications. The difficulty with this commercial proliferation of encryption techniques was
exchange: Two users with different devices could not exchange encrypted information.
Furthermore, there was no independent body capable of testing the devices extensively to verify
that they properly implemented their algorithms.
It soon became clear that encryption was ripe for assessment and standardization, to promote the
ability of unrelated parties to exchange encrypted information and to provide a single encryption
system that could be rigorously tested and publicly certified. As a result, in 1972 the NBS issued
a call for proposals for producing a public encryption algorithm. The call specified desirable
criteria for such an algorithm:
The NBS envisioned providing the encryption as a separate hardware device. To allow the
algorithm to be public, NBS hoped to reveal the algorithm itself, basing the security of the
system on the keys (which would be under the control of the users).
Few organizations responded to the call, so the NBS issued a second announcement in August
1974. The most promising suggestion was the Lucifer algorithm on which IBM had been
working for several years. This idea had been published earlier, so the basic algorithm was
already public and had been open to scrutiny and validation. Although lengthy, the algorithm
was straightforward, a natural candidate for iterative implementation in a computer program.
Furthermore, unlike the Merkle-Hellman (which we study in Chapter 12) and RSA algorithms,
which use arithmetic on 500- or 1,000-digit or longer binary numbers (far larger than most
machine instructions would handle as a single quantity), Lucifer used only simple logical
operations on relatively small quantities. Thus, the algorithm could be implemented fairly
efficiently in either hardware or software on conventional computers.
The data encryption algorithm developed by IBM for NBS was based on Lucifer, and it became
known as the Data Encryption Standard, although its proper name is DEA (Data Encryption
Algorithm) in the United States and DEA1 (Data Encryption Algorithm-1) in other countries.
Then, NBS called on the Department of Defense through its National Security Agency (NSA) to
analyze the strength of the encryption algorithm. Finally, the NBS released the algorithm for
public scrutiny and discussion.
The DES was officially adopted as a U.S. federal standard in November 1976, authorized by
NBS for use on all public and private sector unclassified communication. Eventually, DES was
accepted as an international standard by the International Standards Organization.
The DES algorithm is a careful and complex combination of two fundamental building blocks of
encryption: substitution and transposition. The algorithm derives its strength from repeated
application of these two techniques, one on top of the other, for a total of 16 cycles. The sheer
complexity of tracing a single bit through 16 iterations of substitutions and transpositions has so
far stopped researchers in the public from identifying more than a handful of general properties
of the algorithm.
The algorithm begins by encrypting the plaintext as blocks of 64 bits. The key is 64 bits long, but
in fact it can be any 56-bit number. (The extra 8 bits are often used as check digits and do not
affect encryption in normal implementations.) The user can change the key at will any time there
is uncertainty about the security of the old key.
The algorithm, described in substantial detail in Chapter 12, leverages the two techniques
Shannon identified to conceal information: confusion and diffusion. That is, the algorithm
accomplishes two things: ensuring that the output bits have no obvious relationship to the input
bits and spreading the effect of one plaintext bit to other bits in the ciphertext. Substitution
provides the confusion, and transposition provides the diffusion. In general, plaintext is affected
by a series of cycles of a substitution then a permutation. The iterative substitutions and
permutations are performed as outlined in Figure 2-8.
DES uses only standard arithmetic and logical operations on numbers up to 64 bits long, so it is
suitable for implementation in software on most current computers. Although complex, the
algorithm is repetitive, making it suitable for implementation on a single-purpose chip. In fact,
several such chips are available on the market for use as basic components in devices that use
DES encryption in an application.
As you know, computing power has increased rapidly over the last few decades, and it promises
to continue to do so. For this reason, the DES 56-bit key length is not long enough for some
people to feel comfortable. Since the 1970s, researchers and practitioners have been interested in
a longer-key version of DES. But we have a problem: The DES algorithm is fixed for a 56-bit
key.
Double DES
To address the discomfort, some researchers suggest using a double encryption for greater
secrecy. The double encryption works in the following way. Take two keys, k1 and k2, and
perform two encryptions, one on top of the other: E(k2, E(k1,m)). In theory, this approach should
multiply the difficulty of breaking the encryption, just as two locks are harder to pick than one.
Unfortunately, that assumption is false. Merkle and Hellman [MER81] showed that two
encryptions are no better than one. The basis of their argument is that the cryptanalyst works
plaintext and ciphertext toward each other. The analyst needs two pairs of plaintext (call them P1
and P2) and corresponding ciphertext, C1 and C2, but not the keys used to encrypt them. The
analyst computes and saves P1 encrypted under each possible key. The analyst then tries
decrypting C1 with a single key and looking for a match in the saved Ps. A match is a possible
pair of double keys, so the analyst checks the match with P2 and C2. Computing all the Ps takes
2^56 steps, and working backward from C1 takes only the same amount of time, for a total of 2 * 2^56
or 2^57, equivalent to a 57-bit key. Thus, the double encryption only doubles the work for the
attacker. As we soon see, some 56-bit DES keys have been derived in just days; two times days
is still days, when the hope was to get months if not years for the effort of the second encryption.
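The Merkle-Hellman meet-in-the-middle attack can be demonstrated with a toy 8-bit block cipher (the functions E and D below are illustrative stand-ins for DES, and the 8-bit key space stands in for DES's 56-bit space so the search is tractable):

```python
def rotl(b, n):                 # 8-bit left rotate
    return ((b << n) | (b >> (8 - n))) & 0xFF

def rotr(b, n):                 # 8-bit right rotate
    return ((b >> n) | (b << (8 - n))) & 0xFF

def E(k, m):                    # toy block cipher: XOR with key, then rotate
    return rotl(m ^ k, 3)

def D(k, c):
    return rotr(c, 3) ^ k

# Secret double-encryption keys, unknown to the analyst.
k1, k2 = 0x3A, 0xC5
P1, P2 = 0x41, 0x7E
C1, C2 = E(k2, E(k1, P1)), E(k2, E(k1, P2))

# Meet in the middle: encrypt P1 under every possible first key...
forward = {E(ka, P1): ka for ka in range(256)}
# ...then decrypt C1 under every possible second key, look for a match,
# and confirm each candidate pair against the second known pair.
candidates = [(forward[D(kb, C1)], kb) for kb in range(256)
              if D(kb, C1) in forward
              and E(kb, E(forward[D(kb, C1)], P2)) == C2]
```

The total work is 2 * 256 trial encryptions rather than the 256 * 256 a naive double-key search would need, mirroring the 2 * 2^56 versus 2^112 argument for double DES.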
Triple DES
However, a simple trick does indeed enhance the security of DES. Using three keys adds
significant strength.
The so-called triple DES procedure is C = E(k3, E(k2, E(k1,m))). That is, you encrypt with one
key, decrypt with the second, and encrypt with a third. This process gives a strength equivalent
to a 112-bit key (because the double DES attack defeats the strength of one of the three keys).
A minor variation of triple DES, which some people also confusingly call triple DES, is C =
E(k1, D(k2, E(k1,m))). That is, you encrypt with one key, decrypt with the second, and encrypt
with the first again. This version requires only two keys. (The second decrypt step also makes
this process work for single encryptions with one key: The decryption cancels the first
encryption, so the net result is one encryption.) This approach is subject to another tricky attack,
so its strength is rated at only about 80 bits.
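The encrypt-decrypt-encrypt (EDE) structure, and the reason the middle decryption preserves compatibility with single DES, can be sketched with a toy one-byte cipher (E and D below are illustrative stand-ins, not DES):

```python
def E(k, m):                 # toy cipher standing in for DES
    return (m + k) % 256

def D(k, c):
    return (c - k) % 256

def triple_ede(k1, k2, m):
    # Encrypt with k1, decrypt with k2, encrypt with k1 again.
    return E(k1, D(k2, E(k1, m)))

# With k1 == k2 the middle decryption cancels the first encryption,
# so EDE degenerates to a single encryption under k1.
assert triple_ede(7, 7, 42) == E(7, 42)
```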
In summary, ordinary DES has a key space of 56 bits, double DES is scarcely better, but two-key
triple DES gives an effective length of 80 bits, and three-key triple DES gives a strength of 112
bits. Now, over three decades after the development of DES, a 56-bit key is inadequate for any
serious confidentiality, but 80- and 112-bit effective key sizes provide reasonable security.
Since it was first announced, DES has been controversial. Many researchers have questioned
the security it provides. Much of this controversy has appeared in the open literature, but certain
DES features have neither been revealed by the designers nor inferred by outside analysts.
In 1990, Biham and Shamir [BIH90] invented a technique, differential cryptanalysis, that
investigates the change in algorithmic strength when an encryption algorithm is changed in some
way. In 1991 they applied their technique to DES, showing that almost any change to the
algorithm weakens it. Their changes included cutting the number of iterations from 16 to 15,
changing the expansion or substitution rule, or altering the order of an iteration. In each case,
when they weakened the algorithm, Biham and Shamir could break the modified version. Thus,
it seems as if the design of DES is optimal.
However, Diffie and Hellman [DIF77] argued in 1977 that a 56-bit key is too short. In 1977, it
was prohibitive to test all 2^56 (approximately 7 * 10^16) keys on then current computers. But they
argued that over time, computers would become more powerful and the DES algorithm would
remain unchanged; eventually, the speed of computers would exceed the strength of DES.
Exactly that has happened. In 1997 researchers using over 3,500 machines in parallel were able
to infer a DES key in four months' work. And in 1998 for approximately $100,000, researchers
built a special "DES cracker" machine that could find a DES key in approximately four days.
Does this mean that the DES is insecure? No, not yet. No one has yet shown serious flaws in the
DES. The 1997 attack required a great deal of cooperation, and the 1998 machine is rather
expensive. Triple DES is still well beyond the power of these attacks. Nevertheless, to anticipate
the increasing power of computers, it was clear a new, stronger algorithm was needed. In 1995,
the U.S. National Institute of Standards and Technology (NIST, the renamed NBS) began the
search for a new, strong encryption algorithm. The response to that search has become the
Advanced Encryption Standard, or AES.
The AES is likely to be the commercial-grade symmetric algorithm of choice for years, if not
decades. Let us look at it more closely.
In January 1997, NIST called for cryptographers to develop a new encryption system. As with
the call for candidates from which DES was selected, NIST made several important restrictions.
The algorithms had to be
unclassified
publicly disclosed
available royalty-free for use worldwide
symmetric block cipher algorithms, for blocks of 128 bits
usable with key sizes of 128, 192, and 256 bits
In August 1998, fifteen algorithms were chosen from among those submitted; in August 1999,
the field of candidates was narrowed to five finalists. The five then underwent extensive public
and private scrutiny. The final selection was made on the basis not only of security but also of
cost or efficiency of operation and ease of implementation in software. The winning algorithm,
submitted by two Belgian cryptographers, was Rijndael (pronounced RINE dahl or, to hear the
inventors pronounce it themselves, visit rijndael.com/audio/rijndael_pronunciation.wav); the
algorithm's name is derived from the creators' names, Vincent Rijmen and Joan Daemen. (NIST
described the four not chosen as also having adequate security for the AES: no cryptographic
flaws were identified in any of the five. Thus, the selection was based on efficiency and
implementation characteristics.)
The AES was adopted for use by the U.S. government in December 2001 and became Federal
Information Processing Standard 197 [NIS01].
Overview of Rijndael
Rijndael is a fast algorithm that can be implemented easily on simple processors. Although it
has a strong mathematical foundation, it primarily uses substitution; transposition; and the shift,
exclusive OR, and addition operations. Like DES, AES uses repeat cycles. There are 10, 12, or
14 cycles for keys of 128, 192, and 256 bits, respectively. In Rijndael, the cycles are called
"rounds." Each round consists of four steps:
Byte substitution: This step uses a substitution box structure similar to the DES,
substituting each byte of a 128-bit block according to a substitution table. This is a
straight confusion operation.
Shift row: A transposition step. For 128- and 192-bit block sizes, row n is circularly
shifted left (n - 1) bytes; for 256-bit blocks, row 2 is shifted 1 byte and rows 3 and 4 are
shifted 3 and 4 bytes, respectively. This is a straight diffusion operation.
Mix column: This step involves shifting left and exclusive-ORing bits with themselves.
These operations provide both confusion and diffusion.
Add subkey: Here, a portion of the key unique to this cycle is exclusive-ORed with the
cycle result. This operation provides confusion and incorporates the key.
These four steps are described in more detail in Chapter 12. Note that the steps perform both
diffusion and confusion on the input data. Bits from the key are combined with intermediate
result bits frequently, so key bits are also well diffused throughout the result. Furthermore, these
four steps are extremely fast. The AES algorithm is depicted in Figure 2-9.
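The shift-row step for a 128-bit block, where the state is viewed as a 4 x 4 matrix of bytes, can be sketched as follows (a simplified illustration that ignores AES's column-major state bookkeeping):

```python
def shift_rows(state):
    # state: 4 rows of 4 bytes; row n (1-based) is circularly
    # shifted left (n - 1) bytes.
    return [row[i:] + row[:i] for i, row in enumerate(state)]

state = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
shifted = shift_rows(state)
# Row 1 is unchanged; rows 2, 3, and 4 rotate left 1, 2, and 3 bytes.
```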
Strength of the Algorithm
The Rijndael algorithm is quite new, so there are few reports of extensive experience with its
use. However, between its submission as a candidate for AES in 1997 and its selection in 2001, it
underwent extensive cryptanalysis by both government and independent cryptographers. Its
Belgian inventors have no relationship to the NSA or any other part of the U.S. government, so
there is no suspicion that the government somehow weakened the algorithm or added a trapdoor.
Although the steps of a cycle are simple to describe and seem to be rather random
transformations of bits, in fact (as described in some detail in Chapter 12), these transformations
have a sound mathematical origin.
Comparison of DES and AES
When Rijndael's predecessor, DES, was adopted, two questions quickly arose: Was the algorithm secure, and how long would it remain useful?
With nearly 30 years of use, suspicions of weakness (intentional or not) and backdoors have
pretty much been quashed. Not only have analysts failed to find any significant flaws, but in fact
research has shown that seemingly insignificant changes weaken the strength of the
algorithm; that is, the algorithm is the best it can be. The second question, about how long DES
would last, went unanswered for a long time but then was answered very quickly by two
experiments in which DES was cracked in days. Thus, after 20 years, the power of individual
specialized processors and of massive parallel searches finally overtook the fixed DES key size.
We must ask the same questions about AES: Does it have flaws, and for how long will it remain
sound? We cannot address the question of flaws yet, other than to say that teams of cryptanalysts
pored over the design of Rijndael during the two-year review period without finding any
problems.
The longevity question is more difficult, but also more optimistic, to answer for AES than for
DES. The AES algorithm as defined can use 128-, 192-, or 256-bit keys. This characteristic
means that AES starts with a key more than double the size of a DES key and can extend to
double it yet again. (Remember that doubling the key length squares the number of possible keys
that need to be tested in attempts to break the encryption.) But because there is an evident
underlying structure, it is also possible to use the same general approach on a slightly different
underlying problem and accommodate keys of even larger size. (Even a key size of 256 is
prodigious, however.) Thus, unlike DES, AES can move to a longer key length any time
technology seems to allow an analyst to overtake the current key size.
Moreover, the number of cycles can be extended in a natural way. With DES the algorithm was
defined for precisely 16 cycles; to extend that number would require substantial redefinition of
the algorithm. The internal structure of AES has no a priori limitation on the number of cycles. If
a cryptanalyst ever concluded that 10 or 12 or 14 rounds were too low, the only change needed to
improve the algorithm would be to change the limit on a repeat loop.
A mark of confidence is that the U.S. government has approved AES for protecting Secret and
Top Secret classified documents. This is the first time the United States has ever approved use of
a commercial algorithm derived outside the government (and furthermore outside the United
States) to encrypt classified data.
However, we cannot rest on our laurels. It is impossible to predict now what limitations
cryptanalysts might identify in the future. At present, AES seems to be a significant
improvement over DES, and it can be improved in a natural way if necessary.
2.7. Public Key Encryption
So far, we have looked at encryption algorithms from the point of view of making the scrambling
easy to do (so that the sender can easily encrypt a message) and the decryption easy for the
receiver but not for an intruder. But this functional view of transforming plaintext to ciphertext is
only part of the picture. We must also examine the role of keys in encryption. We have noted
how useful keys can be in deterring an intruder, but we have assumed that the key must remain
secret for it to be effective. In this section, we look at ways to allow the key to be public but still
protect the message. We also focus on the RSA algorithm, a public key system that is a popular
commercial-grade encryption technique.
In 1976, Diffie and Hellman [DIF76] proposed a new kind of encryption system. With a public
key[5] encryption system, each user has a key that does not have to be kept secret. Although
counterintuitive, in fact the public nature of the key does not compromise the secrecy of the
system. Instead, the basis for public key encryption is to allow the key to be divulged but to keep
the decryption technique secret. Public key cryptosystems accomplish this goal by using two
keys: one to encrypt and the other to decrypt.
[5] Asymmetric or public key encryption systems use two keys, a public key and a private key.
Unfortunately, a few people call a symmetric or secret key system a "private key" system. This
terminology is confusing. We do not use it in this book, but you should be aware that you might
encounter the terminology in other readings.
Motivation
Why should making the key public be desirable? With a conventional symmetric key system,
each pair of users needs a separate key. But with public key systems, anyone using a single
public key can send a secret message to a user, and the message remains adequately protected
from being read by an interceptor. Let us investigate why this is so.
Recall that in general, an n-user system requires n * (n - 1)/2 keys, and each user must track and
remember a key for each other user with whom he or she wants to communicate. As the number of
users grows, the number of keys increases very rapidly, as shown in Figure 2-10. Determining and
distributing these keys is a problem. More serious is maintaining security for the keys already
distributed, because we cannot expect users to memorize so many keys.
Characteristics
We can reduce the problem of key proliferation by using a public key approach. In a public key
or asymmetric encryption system, each user has two keys: a public key and a private key. The
user may publish the public key freely because each key does only half of the encryption and
decryption process. The keys operate as inverses, meaning that one key undoes the encryption
provided by the other key.
To see how, let kPRIV be a user's private key, and let kPUB be the corresponding public key. Then,
plaintext encrypted using the public key is decrypted by application of the private key; we write
the relationship as

P = D(kPRIV, E(kPUB, P))
That is, a user can decode with a private key what someone else has encrypted with the
corresponding public key. Furthermore, with some public key encryption algorithms, including
RSA, we have this relationship:
P = D(kPUB, E(kPRIV, P))
In other words, a user can encrypt a message with a private key, and the message can be revealed
only with the corresponding public key. These two properties tell us that public and private keys
can be applied in either order. In particular, the decryption function D can be applied to any
argument so that we can decrypt before we encrypt. With conventional encryption, we seldom
think of decrypting before encrypting. But the concept makes sense with public keys, where it
simply means applying the private transformation first and then the public one.
We have noted that a major problem with symmetric encryption is the sheer number of keys a
single user has to store and track. With public keys, only two keys are needed per user: one
public and one private. Let us see what difference this makes in the number of keys needed.
Suppose we have three users, B, C, and D, who must pass protected messages to user A as well
as to each other. Since each distinct pair of users needs a key, each user would need three
different keys; for instance, A would need a key for B, a key for C, and a key for D. But using
public key encryption, each of B, C, and D can encrypt messages for A by using A's public key.
If B has encrypted a message using A's public key, C cannot decrypt it, even if C knew it was
encrypted with A's public key. Applying A's public key twice, for example, would not decrypt
the message. (We assume, of course, that A's private key remains secret.) Thus, the number of
keys needed in the public key system is relatively small.
The characteristics of secret key and public key algorithms are compared in Table 2-5.
Rivest-Shamir-Adleman Encryption
Let us look at how the RSA encryption scheme works; we investigate it in greater detail in
Chapter 12. RSA relies on an area of mathematics known as number theory, in which
mathematicians study properties of numbers such as their prime factors. The RSA encryption
algorithm combines results from number theory with the degree of difficulty in determining the
prime factors of a given number. As do some of the other algorithms we have studied, the RSA
algorithm also operates with arithmetic mod n.
The two keys used in RSA, d and e, are used for decryption and encryption. They are actually
interchangeable: Either can be chosen as the public key, but one having been chosen, the other
one must be kept private. For simplicity, we call the encryption key e and the decryption key d.
Also, because of the nature of the RSA algorithm, the keys can be applied in either order:
P = E(D(P)) = D(E(P))
(You can think of E and D as two complementary functions, each of which "undoes" the other.)
Any plaintext block P is encrypted as P^e mod n. Because the exponentiation is performed mod n,
factoring P^e to uncover the encrypted plaintext is difficult. However, the decrypting key d is
carefully chosen so that (P^e)^d mod n = P. Thus, the legitimate receiver who knows d simply
computes (P^e)^d mod n = P and recovers P without having to factor P^e.
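A toy instance makes the scheme concrete (the primes here are tiny for illustration only; real RSA uses primes hundreds of digits long, and the modular inverse via three-argument pow requires Python 3.8+):

```python
p, q = 61, 53
n = p * q                 # 3233, the public modulus
phi = (p - 1) * (q - 1)   # 3120
e = 17                    # public encryption exponent, coprime to phi
d = pow(e, -1, phi)       # private decryption exponent: 2753

P = 65                    # a plaintext block, P < n
C = pow(P, e, n)          # encryption: P^e mod n
assert pow(C, d, n) == P  # decryption: (P^e)^d mod n = P
# The keys also work in the other order, as noted above:
assert pow(pow(P, d, n), e, n) == P
```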
The encryption algorithm is based on the underlying problem of factoring large numbers. So far,
nobody has found a shortcut or easy way to factor large numbers efficiently. In a
highly technical but excellent paper, Boneh [BON99] reviews all the known cryptanalytic attacks
on RSA and concludes that none is significant. Because the factorization problem has been open
for many years, most cryptographers consider this problem a solid basis for a secure
cryptosystem.
Encryption algorithms alone are not the answer to everyone's encryption needs. Although
encryption implements protected communications channels, it can also be used for other duties.
In fact, combining symmetric and asymmetric encryption often capitalizes on the best features of
each.
Public key algorithms are useful only for specialized tasks because they are very slow. A public
key encryption can take 10,000 times as long to perform as a symmetric encryption because the
underlying modular exponentiation depends on multiplication and division, which are inherently
slower than the bit operations (addition, exclusive OR, substitution, and shifting) on which
symmetric algorithms are based. For this reason, symmetric encryption is the cryptographers'
"workhorse," and public key encryption is reserved for specialized, infrequent uses, where slow
operation is not a continuing problem.
Let us look more closely at four applications of encryption: cryptographic hash functions, key
exchange, digital signatures, and certificates.
Encryption is most commonly used for secrecy; we usually encrypt something so that its
contents (or even its existence) are unknown to all but a privileged audience. In some cases,
however, integrity is a more important concern than secrecy. For example, in a document
retrieval system containing legal records, it may be important to know that the copy retrieved is
exactly what was stored. Likewise, in a secure communications system, the need for the correct
transmission of messages may override secrecy concerns. Let us look at how encryption provides
integrity.
In most files, the elements or components of the file are not bound together in any way. That is,
each byte or bit or character is independent of every other one in the file. This lack of binding
means that changing one value affects the integrity of the file, but that one change can easily go
undetected.
What we would like to do is somehow put a seal or shield around the file so that we can detect
when the seal has been broken and thus know that something has been changed. This notion is
similar to the use of wax seals on letters in medieval days; if the wax was broken, the recipient
would know that someone had broken the seal and read the message inside. In the same way,
cryptography can be used to seal a file, encasing it so that any change becomes apparent. One
technique for providing the seal is to compute a cryptographic function, sometimes called a hash
or checksum or message digest of the file.
The hash function has special characteristics. For instance, some encryptions depend on a
function that is easy to compute but difficult to invert. For a simple example, consider the
cube function, y = x^3. It is relatively easy to compute x^3 by hand, with pencil and paper, or with a
calculator. But the inverse function, the cube root x = y^(1/3), is much more difficult to compute. And the function
y = x^2 has no single inverse function, since there are two possibilities for x: +sqrt(y) and -sqrt(y). Functions
like these, which are much easier to compute than their inverses, are called one-way functions.
A one-way function can be useful in an encryption algorithm. The function must depend on all
bits of the file being sealed, so any change to even a single bit will alter the checksum result. The
checksum value is stored with the file. Then, each time the file is accessed or used, the checksum
is recomputed. If the computed checksum matches the stored value, it is likely that the file has
not been changed.
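This store-and-recompute check can be sketched in a few lines of Python. SHA-256 stands in for whichever hash function the application chooses; the function name and sample data are illustrative only:

```python
import hashlib

def seal(data: bytes) -> str:
    # Cryptographic checksum ("seal") of the file contents.
    return hashlib.sha256(data).hexdigest()

record = b"legal record: copy of deed, clause 1"
stored = seal(record)   # checksum saved alongside the file

# Later, each time the file is accessed, recompute and compare.
assert seal(record) == stored                                    # seal intact
assert seal(b"legal record: copy of deed, clause 2") != stored   # change detected
```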
A cryptographic function, such as the DES or AES, is especially appropriate for sealing values,
since an outsider will not know the key and thus will not be able to modify the stored value to
match with data being modified. For low-threat applications, algorithms even simpler than DES
or AES can be used. In block encryption schemes, chaining means linking each block to the
previous block's value (and therefore to all previous blocks), for example, by using an exclusive
OR to combine the encrypted previous block with the encryption of the current one. A file's
cryptographic checksum could be the last block of the chained encryption of a file since that
block will depend on all other blocks.
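The chained-checksum idea can be sketched as follows. The per-byte toy_encrypt is a deliberately insecure stand-in for a real block cipher such as DES or AES, and the 8-byte block size is an arbitrary choice for illustration:

```python
def toy_encrypt(block: bytes, key: int) -> bytes:
    # Stand-in for a real block cipher; insecure, for illustration only.
    return bytes(((b + key) * 7) % 256 for b in block)

def chained_checksum(data: bytes, key: int, bs: int = 8) -> bytes:
    data += b"\x00" * (-len(data) % bs)   # pad to a whole number of blocks
    prev = bytes(bs)                      # initial chaining value (all zeros)
    for i in range(0, len(data), bs):
        # XOR the encrypted previous block with the encryption of this block.
        cur = toy_encrypt(data[i:i + bs], key)
        prev = bytes(p ^ c for p, c in zip(toy_encrypt(prev, key), cur))
    return prev                           # the last block seals the whole file

# A one-byte change anywhere alters the final block.
assert chained_checksum(b"hello world", 42) != chained_checksum(b"hello World", 42)
```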
As we see later in this chapter, these techniques address the nonalterability and nonreusability
required in a digital signature. A change or reuse will be flagged by the checksum, so the
recipient can tell that something is amiss.
The most widely used cryptographic hash functions are MD4, MD5 (where MD stands for
Message Digest), and SHA/SHS (Secure Hash Algorithm or Standard). The MD4/5 algorithms
were invented by Ron Rivest and RSA Laboratories. MD5 is an improved version of MD4. Both
condense a message of any size to a 128-bit digest. SHA/SHS is similar to both MD4 and MD5;
it produces a 160-bit digest.
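The digest sizes can be confirmed with Python's standard hashlib module (MD4 is omitted because it is not reliably available there):

```python
import hashlib

msg = b"a message of any size"
assert len(hashlib.md5(msg).digest()) * 8 == 128    # MD5: 128-bit digest
assert len(hashlib.sha1(msg).digest()) * 8 == 160   # SHA-1: 160-bit digest
```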
Wang et al. [WAN05] have announced cryptanalysis attacks on SHA, MD4, and MD5. For SHA,
the attack is able to find two plaintexts that produce the same hash digest in approximately 2^63
steps, far short of the 2^80 steps that would be expected of a 160-bit hash function, and very
feasible for a moderately well-financed attacker. Although this attack does not mean SHA is
useless (the attacker must collect and analyze a large number of ciphertext samples), it does
suggest use of long digests and long keys. NIST [NIS05, NIS06] has studied the attack carefully
and recommended countermeasures.
Key Exchange
Suppose you need to send a protected message to someone you do not know and who does not
know you. This situation is more common than you may think. For instance, you may want to
send your income tax return to the government. You want the information to be protected, but
you do not necessarily know the person who is receiving the information. Similarly, you may
want to use your web browser to connect with a shopping web site, exchange private (encrypted)
e-mail, or arrange for two hosts to establish a protected channel. Each of these situations depends
on being able to exchange an encryption key in such a way that nobody else can intercept it. The
problem of two previously unknown parties exchanging cryptographic keys is both hard and
important. Indeed, the problem is almost circular: To establish an encrypted session, you need an
encrypted means to exchange keys.
Public key cryptography can help. Since asymmetric keys come in pairs, one half of the pair can
be exposed without compromising the other half. To see how, suppose S and R (our well-known
sender and receiver) want to derive a shared symmetric key. Suppose also that S and R both have
public keys for a common encryption algorithm; call these kPRIV-S, kPUB-S, kPRIV-R, and kPUB-R, for
the private and public keys for S and R, respectively. The simplest solution is for S to choose any
symmetric key K, and send E(kPRIV-S,K) to R. Then, R takes S's public key, removes the
encryption, and obtains K. Alas, any eavesdropper who can get S's public key can also obtain K.
Instead, let S send E(kPUB-R, K) to R. Then, only R can decrypt K. Unfortunately, R has no
assurance that K came from S.
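The E(kPUB-R, K) step can be sketched with RSA as the asymmetric algorithm. The key pair below (p = 61, q = 53, so n = 3233, e = 17, d = 2753) is a textbook toy, far too small for real security:

```python
n_R, e_R, d_R = 3233, 17, 2753   # R's toy RSA modulus, public and private exponents

K = 1234                            # symmetric key S chose (must be < n_R)
wrapped = pow(K, e_R, n_R)          # S sends E(kPUB-R, K) over the open channel
recovered = pow(wrapped, d_R, n_R)  # only R, holding d_R, can unwrap K
assert recovered == K
```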
We can think of this exchange in terms of lockboxes and keys. If S wants to send something
protected to R (such as a credit card number or a set of medical records), then the exchange
works something like this. S puts the protected information in a lockbox that can be opened only
with S's public key. Then, that lockbox is put inside a second lockbox that can be opened only
with R's private key. R can then use his private key to open the outer box (something only he can
do) and use S's public key to open the inner box (proving that the package came from S). In other
words, the protocol wraps the protected information in two packages: the first unwrapped only
with S's public key, and the second unwrapped only with R's private key. This approach is
illustrated in Figure 2-11.
Another approach not requiring pre-shared public keys is the so-called Diffie-Hellman key
exchange protocol. In this protocol, S and R use some simple arithmetic to exchange a secret.
They agree on a field size n and a starting number g; they can communicate these numbers in the
clear. Each thinks up a secret number, say, s and r. S sends to R g^s and R sends to S g^r. Then S
computes (g^r)^s and R computes (g^s)^r, which are the same, so g^(rs) = g^(sr) becomes their shared secret.
(One technical detail has been omitted for simplicity: these computations are done over a field of
integers mod n; see Chapter 12 for more information on modular arithmetic.)
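The exchange can be sketched with Python's three-argument pow, which performs modular exponentiation; the field size, starting number, and secret numbers below are toy values:

```python
p, g = 23, 5        # public: field size and starting number
s, r = 6, 15        # private: S's and R's secret numbers

A = pow(g, s, p)    # S sends g^s mod p in the clear
B = pow(g, r, p)    # R sends g^r mod p in the clear

# Each side raises what it received to its own secret power.
assert pow(B, s, p) == pow(A, r, p)   # shared secret g^(rs) mod p
```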
Digital Signatures
Another typical situation parallels a common human need: an order to transfer funds from one
person to another. In other words, we want to be able to send electronically the equivalent of a
computerized check. We understand how this transaction is handled in the conventional, paper
mode.
Transacting business by check depends on tangible objects in a prescribed form. But tangible
objects do not exist for transactions on computers. Therefore, authorizing payments by computer
requires a different model. Let us consider the requirements of such a situation, both from the
standpoint of a bank and from the standpoint of a user.
Suppose Sandy sends her bank a message authorizing it to transfer $100 to Tim. Sandy's bank
must be able to verify and prove that the message really came from Sandy if she should later
disavow sending the message. The bank also wants to know that the message is entirely Sandy's,
that it has not been altered along the way. On her part, Sandy wants to be certain that her bank
cannot forge such messages. Both parties want to be sure that the message is new, not a reuse of
a previous message, and that it has not been altered during transmission. Using electronic signals
instead of paper complicates this process.
But we have ways to make the process work. A digital signature is a protocol that produces the
same effect as a real signature: It is a mark that only the sender can make, but other people can
easily recognize as belonging to the sender. Just like a real signature, a digital signature is used
to confirm agreement to a message.
Properties
A digital signature must meet two primary conditions:
It must be unforgeable. If person P signs message M with signature S(P,M), it is impossible for
anyone else to produce the pair [M, S(P,M)].
It must be authentic. If a person R receives the pair [M, S(P,M)] purportedly from P, R can check
that the signature is really from P. Only P could have created this signature, and the signature is
firmly attached to M.
These two requirements, shown in Figure 2-12, are the major hurdles in computer transactions.
Two more properties, also drawn from parallels with the paper-based environment, are desirable
for transactions completed with the aid of digital signatures:
It is not alterable. After being transmitted, M cannot be changed.
It is not reusable. A previous message presented again will be detected by R.
To see how digital signatures work, we first present a mechanism that meets the first two
requirements. We then add to that solution to satisfy the other two requirements.
Public key encryption systems are ideally suited to digital signatures. For simple notation, let us
assume that the public key encryption for user U is accessed through E(M, KU) and that the
private key transformation for U is written as D(M,KU). We can think of E as the privacy
transformation (since only U can decrypt it) and D as the authenticity transformation (since only
U can produce it). Remember, however, that under some asymmetric algorithms such as RSA, D
and E are commutative, and either one can be applied to any message. Thus,
D(E(M, KU), KU) = M = E(D(M, KU), KU)
If S wishes to send M to R, S uses the authenticity transformation to produce D(M, KS). S then
sends D(M, KS) to R. R decodes the message with the public key transformation of S, computing
E(D(M,KS), KS) = M. Since only S can create a message that makes sense under E(·, KS), the
message must genuinely have come from S. This test satisfies the authenticity requirement.
R will save D(M,KS). If S should later allege that the message is a forgery (not really from S), R
can simply show M and D(M,KS). Anyone can verify that since D(M,KS) is transformed to M
with the public key transformation of S, and only S could have produced D(M,KS), then D(M,KS)
must be from S. This test satisfies the unforgeable requirement.
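With RSA as the commutative algorithm, the sign-and-verify exchange can be sketched as below. The key pair is the same kind of textbook toy (n = 3233, e = 17, d = 2753), and the message is encoded as a small integer purely for illustration:

```python
n_S, e_S, d_S = 3233, 17, 2753   # S's toy RSA key pair (public n, e; private d)

M = 65                            # message, encoded as an integer < n_S
sig = pow(M, d_S, n_S)            # D(M, KS): only S, holding d_S, can compute this

# R (or anyone) applies S's public transformation to check the signature.
assert pow(sig, e_S, n_S) == M                # E(D(M, KS), KS) = M
assert pow((sig + 1) % n_S, e_S, n_S) != M    # a tampered signature fails
```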
There are other approaches to implementing digital signature; some use symmetric encryption,
others use asymmetric. The approach shown here illustrates how the protocol can address the
requirements for unforgeability and authenticity. To add secrecy, S applies E(M, KR) as shown in
Figure 2-13.
Next, we learn about cryptographic certificates so that we can see how they are used to address
authenticity.
Certificates
As humans we establish trust all the time in our daily interactions with people. We identify
people we know by recognizing their voices, faces, or handwriting. At other times, we use an
affiliation to convey trust. For instance, if a stranger telephones us and we hear, "I represent the
local government…" or "I am calling on behalf of this charity…" or "I am calling from the
school/hospital/police about your mother/father/son/daughter/brother/sister…," we may decide to
trust the caller even if we do not know him or her. Depending on the nature of the call, we may
decide to believe the caller's affiliation or to seek independent verification. For example, we may
obtain the affiliation's number from the telephone directory and call the party back. Or we may
seek additional information from the caller, such as "What color jacket was she wearing?" or
"Who is the president of your organization?" If we have a low degree of trust, we may even act
to exclude an outsider, as in "I will mail a check directly to your charity rather than give you my
credit card number."
For each of these interactions, we have what we might call a "trust threshold," a degree to which
we are willing to believe an unidentified individual. This threshold exists in commercial
interactions, too. When Acorn Manufacturing Company sends Big Steel Company an order for
10,000 sheets of steel, to be shipped within a week and paid for within ten days, trust abounds.
The order is printed on an Acorn form, signed by someone identified as Helene Smudge,
Purchasing Agent. Big Steel may begin preparing the steel even before receiving money from
Acorn. Big Steel may check Acorn's credit rating to decide whether to ship the order without
payment first. If suspicious, Big Steel might telephone Acorn and ask to speak to Ms. Smudge in
the purchasing department. But more likely Big Steel will actually ship the goods without
knowing who Ms. Smudge is, whether she is actually the purchasing agent, whether she is
authorized to commit to an order of that size, or even whether the signature is actually hers.
Sometimes a transaction like this occurs by fax, so that Big Steel does not even have an original
signature on file. In cases like this one, which occur daily, trust is based on appearance of
authenticity (such as a printed, signed form), outside information (such as a credit report), and
urgency (Acorn's request that the steel be shipped quickly).
For electronic communication to succeed, we must develop similar ways for two parties to
establish trust without having met. A common thread in our personal and business interactions is
the ability to have someone or something vouch for the existence and integrity of one or both
parties. The police, the Chamber of Commerce, or the Better Business Bureau vouches for the
authenticity of a caller. Acorn indirectly vouches for the fact that Ms. Smudge is its purchasing
agent by transferring the call to her in the purchasing department. In a sense, the telephone
company vouches for the authenticity of a party by listing it in the directory. This concept of
"vouching for" by a third party can be a basis for trust in commercial settings where two parties
do not know each other.