CS431 Computer and Network Security
Digital Signatures
Abhishek Bichhawat 12/03/2024
Story So Far
● Require messages to be transmitted confidentially over channels
such that the integrity, authenticity of the data is not
compromised
Symmetric-key Asymmetric-key
Confidentiality ● One-time pads ● Diffie-Hellman-Merkle
● Block ciphers with chaining ● RSA encryption
modes (e.g. AES-CBC) ● ElGamal encryption
Integrity,
Authentication
2
Story So Far
● Require messages to be transmitted confidentially over channels
such that the integrity, authenticity of the data is not
compromised
Symmetric-key Asymmetric-key
Confidentiality ● One-time pads ● Diffie-Hellman-Merkle
● Block ciphers with chaining ● RSA encryption
modes (e.g. AES-CBC) ● ElGamal encryption
Integrity, ● MACs (e.g. HMAC) ● Digital signatures
Authentication (e.g. RSA signatures)
3
Digital Signatures
● Digital signatures are a way to provide integrity/authenticity
● A digital signature scheme is a triple〈G, S, V〉of efficiently computable
algorithms
○ G outputs a “public key” VK and a “private key” SK:
〈VK, SK〉← G(⋅)
○ S takes a “message” m and SK as input and outputs a “signature” σ:
σ ← SSK(m)
○ V takes a “message” m, signature σ and public key VK as input, and outputs a bit
b:
b ← VVK(m, σ)
○ If σ ← SSK(m) then VVK(m, σ) outputs 1 (“valid”)
○ Given only VK and message/signature pairs {〈mi, SSK(mi)〉}i, it is computationally
infeasible to compute 〈m, σ〉such that VVK(m, σ) = 1 for any new m ≠ mi
4
Digital Signatures
YES!
Certificate Authority
Is this Alice’s
signature? 5
Digital Signatures
D E
M S S,M
6
RSA Signatures
● Key generation (same as in RSA PKE):
○ Choose two large prime numbers p and q such that p ≠ q.
○ Pick integer e coprime with (p-1)(q-1) (i.e., gcd(e, (p-1)(q-1)) = 1)
○ Compute d such that ed mod (p-1)(q-1) = 1
○ Private key = (n, d)
○ Public key = (n, e)
● Sign (d, m):
○ Compute sig = md mod pq
● Verify(e, n, m, sig):
○ Check m ≡ sige mod pq
7
Digital Signatures - Compromises
● Existential forgery
○ The attacker manages to forge a signature of (at least) one
message, but not necessarily of his choice
● Selective forgery
○ The attacker manages to forge a signature of (at least) one
message of his choice
● Universal forgery
○ The attacker manages to forge a signature of any message
8
Summary
● Public key encryption
○ Private and public key pairs
○ Public key encrypts and private key decrypts
○ Provides properties similar to symmetric key encryption
● Digital signatures
○ Provide integrity and authenticity for asymmetric schemes
○ Private and public key pairs
○ Encrypt with private key and decrypt with public key
9
Cryptographic Hash
● Function that applied to a message generates a fixed-length
“fingerprint” of the message
○ H(m) = {0, 1}n
○ Takes a variable sized input and produces fixed-sized output
○ Simplest hash function consists of starting with the first n-bit
block, XORing it bit-by-bit with the second n-bit block, XORing the
result with the next n-bit block, and so on.
10
Cryptographic Hash
● Function that applied to a message generates a fixed-length
“fingerprint” of the message
○ H(m) = {0, 1}n
○ Takes a variable sized input and produces fixed-sized output
○ Output is referred to as hash value or message digest
○ Any changes in the message, changes the hash too
Message: "The quick brown fox jumps over the lazy dog"
SHA1 hashcode: 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12
Message: "The quick brown fox jumps over the lazydog"
SHA1 hashcode: 8de49570b9d941fb26045fa1f5595005eb5f3cf2
11
Cryptographic Hash
● Function that applied to a message generates a fixed-length
“fingerprint” of the message
○ H(m) = {0, 1}n
○ Any changes in the message, changes the hash too
● Deterministic one-way function
12
One-way Functions
● Is H(X) = 1 a one-way function?
● Is H(s) = “Take first letter of each word in a sentence s” a
one-way function?
13
One-way Functions
● Is H(X) = 1 a one-way function?
● Is H(s) = “Take first letter of each word in a sentence s” a
one-way function?
● Is
H( )=
14
Cryptographic Hash
● Function that applied to a message generates a fixed-length
“fingerprint” of the message
○ H(m) = {0, 1}n
○ Any changes in the message, changes the hash too
● Deterministic one-way function
○ Given x, it is easy to compute H(x).
○ However, given a hash output y, it is infeasible to find any input x
such that H(x) = y
15
Cryptographic Hash
● Function that applied to a message generates a fixed-length
“fingerprint” of the message
○ H(m) = {0, 1}n
○ Any changes in the message, changes the hash too
● Deterministic one-way function
● Collision-resistant
○ For any x and x’, it is infeasible to find H(x) = H(x’) if x ≠ x’
○ Weak : Given x, it is infeasible to find H(x) = H(x’) for x’ ≠ x
● Hash functions can be used to verify message integrity
16
Hashing Algorithms
● MD4, SHA-0 (insecure; have practical attacks)
● MD5 (128 bits; insecure; have practical attacks)
● SHA-1 (160 bits; insecure; proposed attack not practical yet)
● SHA-2
○ Similar construction as SHA1
○ Includes SHA-256, SHA-384, SHA-512
● SHA-3 (Keccak)
○ Different from SHA1, SHA2
○ SHA3-256, SHA3-512
17
Collision
● Is H(X) = 1 collision-resistant?
● Is H(s) = “Take first letter of each word in a sentence s”
collision-resistant?
● Is
H( )= ?
18
Collision Resistance
How large should N be to have a reasonable
chance (p>0.5) to find a collision?
19
Collision Resistance - Birthday Problem
How many people should be there in a room
so that there is a reasonable chance (p > 0.5)
of two persons sharing the same birthday?
=365
20
Collision Resistance - Birthday Problem
p(n) → probability that any two people share birthday (Strong)
q(n) → probability that someone shares a birthday with me (Weak)
21
Collision Resistance - Birthday Attack
● Finding a collision on an n-bit output requires only 2n/2 tries on
average
● The aim of the attacker is to compromise the data integrity of a
message sent by the victim
● Knows that victim uses hash function with n-bit digest
● Create a large number of variations (2n/2) of message
● Find a message with same hash as the original message
22
Merkle-Damgård Construction
If the compression function is collision-resistant and “appropriate”
padding is used, then the hash function will also be collision-resistant
RESULT
23
Hashes and Integrity
● If the attacker cannot modify the hash, then we have integrity
● If the attacker can modify the hash:
○ Along with the message, we have no integrity
● Hashes are computed using public information only
○ Anyone can compute hash of a message
● Add secret keys to compute hash
24
MD5
● Designed by Ron Rivest in 1992
● Transforms arbitrary length input into 128-bit output
● MD5 is improved version of MD4
● Used as a basis for SHA-1 (160 bits)
● MD5 has been shown to be vulnerable to collisions
○ (SHA-1 too has been shown to be broken…)
● Based on the Merkle-Damgård construction
25
MD5
26
MD5 Padding and Sequencing
● Take original message M
● len = Length of M
● Append “10000…” to M to obtain message M’ with length (512x
- 64) bits
● Append len mod 264 to M’ to obtain M’’
● M’’ is of size 512x, chop it up in pieces of 512 bits
● Process each 512-bit block at a time
27
MD5 Process
● Each block Bi contains sixteen 32-bit words (512 bits)
● MD5 digest = four 32-bit words = 128 bit
● IV=initialization vector
○ A0 = 01234567
○ B0 = 89ABCDEF
○ C0 = FEDCBA98
○ D0 = 76543210
● Every stage consists of four rounds over the message block
28
MD5 Stage i CVi-1
Bi
CVi 29
MD5 Stage i
● F(b,c,d) = if b then c else d
● G(b,c,d) = if d then b else c
● H(b,c,d) = b ⊕ c ⊕ d
● I(b,c,d) = c ⊕ (b & !d)
● T[i] = int (232abs(sin(i))), 0 < i ≤ 64
● One of the 16 32-bit words of Bi is used at each step, so that Bi
is used in each round
○ perm Bi defines the order in which each of the 16 words of Bi are
used
30
MD5
● Reduces arbitrary length text to a 128-bit output
● Uses same principle as cipher block chaining
○ Succession of stages combining input and previous result
○ Basic principles of chaining, compression
● Extremely complicated in a given stage
○ Tries to mix bits as much as possible and avoid collisions
○ Unfortunately, broken in 2006: two message texts with the same
hash value can be found in about an hour on an IBM P960 (super
multiprocessor, but still)
○ Wang and Yu found a way to find M,M’ and N,N’ such that
f(f(s,M),M’) = f(f(s,N),N’) for any MD5 state s
31
Length Extension Attack
● Given H(x) and the length of x, but not x, an attacker can create
H(x || m) for any m of the attacker’s choosing
● Hash algorithms based on Merkle-Damgard construction are
vulnerable to this attack
● Message digest from F can be used in the next stage to get the
new hash without knowing x
32
Cryptographic Hash and Integrity
● If the attacker cannot modify the hash, then we have integrity
● If the attacker can modify the hash:
○ Along with the message, we have no integrity
● Hashes are computed using public information only
○ Anyone can compute hash of a message
● Add secret keys to compute hash
33
Message Authentication Codes (MAC)
● “Cryptographic checksum,” i.e., keyed hash
● Can use symmetric block cipher or (more commonly) one-way
hash function as a basis
● Provides authentication and integrity
○ Send message and tag (MAC)
34
Message Authentication Codes
A message authentication code (MAC) scheme is a triple〈G, T, V〉 of
efficiently computable functions
● G outputs a “secret key” K
K ← G(⋅)
● T takes a key K and “message” m as input, and outputs a “tag” t
t ← TK(m)
● V takes a message m, tag t and key K as input, and outputs a bit b
b ← VK(m, t)
● If t ← TK(m) then VK(m, t) outputs 1 (“valid”)
● Given only message/tag pairs {〈mi, TK(mi)〉}i, it is computationally
infeasible to compute 〈m, t〉 such that VK(m, t) = 1 for any new m
≠ mi
35
MAC
● Alice wants to send m to Bob and guarantee integrity
● Alice sends m and t = MAC(K, m) to Bob
● Bob receives m and t
● Bob computes MAC(K, m) and checks that it matches t
● If the MACs match, Bob is confident the message has not been
tampered with.
36
Hash Message Authentication Code (HMAC)
● Hash-based MAC
○ Cryptographic hash
○ Secret key
● Provide authentication using a shared secret instead of using
digital signatures
● Normally, we could have done MAC = H(Key || Message)
but this is susceptible to length extension attacks
● Instead, we do MAC = H(Key || H(Key || Message))
37
HMAC
38
Randomness
● Randomness is an important property required by
cryptographic algorithms
○ Used to generate nonces, IVs, keys etc.
● If the attacker can derive the random number, then no
guarantees remain
● We want to generate random numbers, securely
● Entropy
○ Measure of unpredictability of outputs (more is better)
○ Tossing a fair coin has a high unpredictability because both the
outcomes are equally possible
○ In general, uniform distribution has highest entropy
39
Randomness
● True randomness requires some physical source of entropy
○ Heat or light intensity
○ Human activity - moving the cursor, pressing keys (check PGP)
● Multiple sources of randomness may be better
● May be biased, at times
● Is also expensive
40
Pseudorandom number generators (PRNGs)
● Uses some randomness to generate random numbers
● They are deterministic
○ Algorithm is set and with the same seed value it generates the
random numbers in the same order
● But may be random for an adversary who does not know the
algorithm
● Input : Some random value
● Output : Pseudorandom numbers
41
Insecure PRNGs
● Casinos in Missouri experienced unusual activities
○ Suspicious players would hover over the lever and then spin at a
specific time to win
○ Vulnerability: Slot machines used predictable PRNGs
○ The PRNG output was based on the current time
● OpenSSL vulnerability
○ OpenSSL is a cryptographic library
○ Used process-ids to seed the PRNG
○ Normally has only 32,768 possible numbers (in 32-bit machines)
42
Authenticated Encryption
● Scheme guaranteeing both confidentiality and integrity
● Either:
○ Combine two schemes providing each
○ Use a scheme providing both
43
Authenticated Encryption
● Scheme guaranteeing both confidentiality and integrity
● Combine two schemes providing each
○ Suppose a message M needs to be sent:
EncK1(M) and MACK2(M)
■ Provides integrity but no confidentiality because of the MAC
which is deterministic and is not Ind-CPA
■ MACs in general may leak information about the message
44
Authenticated Encryption
● Scheme guaranteeing both confidentiality and integrity
● Combine two schemes providing each
○ Suppose a message M needs to be sent:
EncK1(M) and MACK2(EncK1(M))
■ Provides integrity
■ Provides confidentiality - MACs may leak information about
the ciphertext
45
MAC-then-Encrypt
● First compute MACK2(M)
● Then compute EncK1(M || MACK2(M))
● Shortcoming
○ Attacker can supply arbitrary tampered input, and you always have to
decrypt it
○ Passing attacker-chosen input through the decryption function can cause
side-channel leaks
○ Cause of “Lucky 13” attack in TLS 1.0
■ Use timing side-channel
■ Nice explanation here:
https://medium.com/@c0D3M/lucky-13-attack-explained-dd9a9fd
42fa6 46
Encrypt-then-MAC
● First compute EncK1(M)
● Then compute MACK2(EncK1(M))
○ Is better than MAC-then-encrypt
47
Authenticated Encryption
● Scheme guaranteeing both confidentiality and integrity
● Combine two schemes providing each
○ Suppose a message M needs to be sent:
EncK1(M || MACK2(M))
■ Provides integrity
■ Provides confidentiality
48
Authenticated Encryption with Additional Data
● Scheme guaranteeing both confidentiality and integrity
● Use scheme designed for both:
○ AEAD
■ Provides both confidentiality and integrity over plaintext
■ Provides integrity over the additional (associated) data
49
Public Key Infrastructure
● Remember the problem of sharing public keys and the
man-in-the-middle attack
● Need an infrastructure to verify the authenticity of public keys
E D
M C M
50
Public Key Infrastructure
● Certificate Authority
○ Issues certificates
■ Endorses the public key of a participant
■ Binds the participant’s name to the public key
● Trust Anchor
○ Entity for which trust is assumed and not derived
○ Trust others’ data through the anchor
○ Identify a certificate authority using a root certificate
○ Stores certificates for all the participants that trust the anchor
○ Establish a chain of trust
■ Verify the sender and intermediate certificate issuers
51
Chain of Trust Charlie
{KC}KA’
Alice
{KA}KCA’
DeeDee
{KD}KA’ Elsa
{KE}KD’
Trust anchor
{KCA}KCA’ Bob
{KB}KCA’
52
Trust Anchors
● Who can be a trust anchor?
○ Central universally trusted CA
■ Contains certificates for all users
■ CA’s public key known to everyone
■ Trust on CA implies trust on the signed public key received
■ We trust that the CA has verified the identity of the user
whose public key we are asking for
53
Trust Anchors
● Who can be a trust anchor?
○ Central universally trusted CA
■ Contains certificates for all users
■ CA’s public key known to everyone
■ Trust on CA implies trust on the signed public key received
■ We trust that the CA has verified the identity of the user
whose public key we are asking for
○ Problems with central trusted CA
■ Single point of failure
■ Scalability
54
Trust Anchors
● Who can be a trust anchor?
○ Central universally trusted CA with RA
■ RAs are intermediate regional authorities that can verify keys
and can sign keys if the CA delegates this authority
■ CA issues certificates to the RAs
■ Creates a certificate chain
55
Trust Anchors
● Who can be a trust anchor?
○ Multiple trusted CAs with RAs
■ Addresses scalability too
■ Some of them are added to the softwares when shipping
56
Web-of-Trust
57
Web-of-Trust
58
Web-of-Trust
● Challenges
○ Trust is not transitive
■ Just because Alice trusts Bob and Bob trusts Charlie,
it does not mean that Alice trusts Charlie
○ Trust is not absolute
■ You may trust someone for specific tasks but not other tasks
■ Some security expert:
“I trust my bank with my money but not with my children;
I trust my relatives with my children but not with my money.”
59
Stored Certificates
60
Certificate Revocation
● Certificate revocation is a mechanism to invalidate certificates
○ After a private key is disclosed
○ Employee leaves corporation
○ Certificate expired
■ Expiration time is usually chosen too long (updating
certificates is a lot of work)
■ Mitigates damage
○ Implementation flaws
■ ACME implementation Boulder (used, e.g., by Let’s Encrypt)
had a flaw that allowed an attacker to obtain certificates for
domains it does not own. As a consequence of this flaw, Let’s
Encrypt had to revoke more than 3 million certificates
○ Expensive process
61
Building a Cryptosystem
Requires understanding possible interactions between the
components
● Example
○ Pretty good privacy (Phil Zimmerman)
○ Used to digitally sign and/or encrypt email
62
Pretty Good Privacy
63
Passwords
64
Passwords
● Combination of characters to authenticate a user
○ When you create an account with a service: Create a password
○ When you later want to log in to the service: Type in the same
password again
● How does the service check that your password is correct?
65
Passwords
● Combination of characters to authenticate a user
○ When you create an account with a service: Create a password
○ When you later want to log in to the service: Type in the same
password again
● How does the service check that your password is correct?
○ Store a file listing every user’s password
66
Passwords
● Combination of characters to authenticate a user
○ When you create an account with a service: Create a password
○ When you later want to log in to the service: Type in the same
password again
● How does the service check that your password is correct?
○ Store a file listing every user’s password
○ Problem: What if an attacker hacks into the service? Now the
attacker knows everyone’s passwords!
67
Passwords
● Combination of characters to authenticate a user
○ When you create an account with a service: Create a password
○ When you later want to log in to the service: Type in the same
password again
● How does the service check that your password is correct?
○ Encrypt every user’s password before storing it
68
Passwords
● Combination of characters to authenticate a user
○ When you create an account with a service: Create a password
○ When you later want to log in to the service: Type in the same
password again
● How does the service check that your password is correct?
○ Encrypt every user’s password before storing it
○ Where’s the key stored?
○ Problem: The attacker could steal the encrypted passwords and
the key and decrypt everyone’s passwords!
69
Password Hashes
● Store a hash of the password
○ Hashes are
■ Deterministic: To verify a password, it has to hash to the same
value every time
■ One-way: We don’t want the attacker to reverse hashes into
original passwords
● Password verification
○ Hash the password provided by the user
○ See if the hash matches the password hash in the file
70
Attacks on Hashes
● Adversary can get access to the file containing password hashes
○ Hashes are deterministic so passwords that are common across
users will have the same hash value
○ Attacker can compute hash values for common passwords
■ Most people use passwords that are very common
■ Dictionary attack - Hash an entire directory of passwords
■ Rainbow tables - Makes brute-force easier
71
Offline Attacks on Password Hashes
● Adversary steals the
password file, and then
computes hashes herself
to check for matches.
● Adversary can try a huge
number of passwords
● If an attacker can do an
offline attack, you need a
really strong password
(e.g. 7 or more random
words)
72
https://xkcd.com/936/
Solutions for Offline Attacks on Password Hashes
● Add salt
○ Unique random number per user
○ For each user, store: username, salt,
H(password || salt)
○ To verify a password, look up the user’s
salt in the file, compute H(password || salt),
and check it matches the hash in the file
○ Salts are not secret
73
Solutions for Offline Attacks on Password Hashes
● Slow Hash
○ Use a hash function that computes hashes slowly
○ Legitimate users will not notice if it takes 0.0001 seconds or 0.1 seconds
for the server to check a password.
○ However, adversaries need to compute millions of hashes; using a slow
hash can slow the brute-force attack making them impractical
74
Online Attacks on Password Hashes
● Adversary interacts with the service (e.g., tries to log in to a website)
by trying every different password.
○ Start with common passwords like 123456, password, qwerty etc.
○ With a dictionary of the 10 most common passwords, you can expect to
find about 1% of users’ passwords
● The adversary doesn’t compute the hashes.
● The tries of the attacker are limited
75
Solution to Online Attacks on Passwords
● Rate-limiting
○ Limit the number of tries within a time limit
○ Lock accounts if a certain number of tries fail
○ May result in DoS
● Impose password requirements
○ Make it harder to guess password
● CAPTCHAs
○ Make it longer for adversary to complete a guess
○ Can remove the possibility of automated checks
● These may not help against untargeted attacks
76
Choosing Passwords
https://xkcd.com/538/ 77