Data Integrity
By: Dr. Ramzi Saifan
Encryption/Decryption
• Provides message confidentiality.
• Does it provide message authentication?
2
Message Authentication
• Bob receives a message m from Alice, he wants to know
(Data origin authentication) whether the message was
really sent by Alice;
(Data integrity) whether the message has been modified.
• Solutions:
Alice attaches a message authentication code (MAC)
to the message.
Or she attaches a digital signature to the message.
3
Communication without authentication
Very easy..
Eve Eve can simply
change the
message
M M’
Alice Bob
Integrity with Hash
Forge M’ and
compute h(M’)
Eve
No shared
key
M M’
h (M) h (M)
Alice Bob
Can we simply send the hash with the message to serve message authentication ?
Ans: No, Eve can change the message and recompute the hash.
Using hash needs more appropriate procedure to guarantee integrity
Integrity Protection with MAC
k=??,
MAC=??
Eve Eve can not
forge MAC
when k is
M M’ unknown
MAC (k,M) MAC??
Alice Bob
Key : k Key : k
Shared key k to generate authenticate message
Message Authentication Code
➢A function of the message and a secret key that produces a fixed-length
value that serves as the authenticator
➢Generated by an algorithm :
➢ generated from message + secret key : MAC = F(K,M)
➢ A small fixed-sized block of data
➢ appended to message as a signature when sent
➢Receiver performs same computation on message and checks it matches
the MAC
Keyed Hash Functions as MACs
➢Want a MAC based on a hash function
⚫because hash functions are generally faster
⚫crypto hash function code is widely available
⚫But hashing is internally has no key!
➢Original proposal:
KeyedHash = Hash(Key|Message)
⚫some weaknesses were found with this
➢Eventually led to development of HMAC
To be useful for message authentication, a hash function H
must have the following properties:
Can be applied to a block of data of any size
Produces a fixed-length output
H(x) is relatively easy to compute for any given x
One-way or pre-image resistant
• Computationally infeasible to find x such that H(x) = h
Computationally infeasible to find y ≠ x such that H(y) = H(x)
Collision resistant or strong collision resistance
• Computationally infeasible to find any pair (x,y) such that H(x) = H(y)
Birthday Problem
• Birthday problem: In a group of k people, what is the probability
that at least two people have the same birthday?
Having the same birthday is a collision?
• Birthday paradox: p 1 2 with k as small as 23.
• Consider a hash function h : {0,1} → {0,1}n .
• If we randomly generate k messages, the probability
of having a collision depends on n.
• To resist birthday attack, we choose n to be sufficiently large that
it will take an infeasibly large k to have a non-negligible probability
of collision.
Collision-resistant hash functions
• Collision-resistant hash functions can be built from collision-resistant
compression functions using Merkle-Damgard construction.
m1 m2 m3 mk
v0 v1 v2 vk
IV f f f … f h(m)
Compression function f :{0,1}n + b → {0,1}n
17
• Insecure: MACk ( m) = h( m) with IV = k .
(For simplicity, without padding)
m = m1 m2 m3 ms
f … f X hk(m)
h(m)
X
k IV f f
ms+1
• Easy to forge:
(m, hk ( m)), f
hk(m) hk(m||ms+1)
where m = m ms +1
HMAC
• Interest in developing a MAC derived from a cryptographic hash code
• Cryptographic hash functions generally execute faster
• Library code is widely available
• SHA-1 was not deigned for use as a MAC because it does not rely on a secret key
• Has been chosen as the mandatory-to-implement MAC for IP security
• Used in other Internet protocols such as Transport Layer Security (TLS) and Secure Electronic
Transaction (SET)
HMAC
• HMAC(K,m) = H( (K+ ⊕ opad) || H((K+ ⊕ ipad) || m ) ), where
• H : is a cryptographic hash function, composed of multiple rounds with operations AND, OR, XOR,
NOT, and SHIFT. Very efficient to compute.
• K: is the secret key,
• M: is the message to be authenticated,
• K+ : is another secret key, derived from the original key K (by padding K to the right with extra
zeroes to the input block size of the hash function, or by hashing K if it is longer than that block
size,
• || denotes concatenation,
• opad is the outer padding (0x5c5c5c…5c5c, one-block long constant),and
• ipad is the inner padding (0x363636…3636, one-block long constant).
HMAC
Hash functions in practice
• MD5
• 128-bit output
• Introduced in 1991…collision attacks found in 2004…several extensions and improvements
since then
• Still widely deployed(!)
• SHA-1
• 160-bit output
• No collisions known, but theoretical attacks exist
• SHA-2
• 256-/512-bit outputs
Secure Hash Algorithm (SHA)
• SHA was originally developed by NIST
• Published as FIPS 180 in 1993
• Was revised in 1995 as SHA-1
• Produces 160-bit hash values
• NIST issued revised FIPS 180-2 in 2002
• Adds 3 additional versions of SHA
• SHA-256, SHA-384, SHA-512
• With 256/384/512-bit hash values
• Same basic structure as SHA-1 but greater security
• The most recent version is FIPS 180-4 which added two variants of
SHA-512 with 224-bit and 256-bit hash sizes
Comparison of SHA Parameters
Notes: 1. All sizes are measured in bits.
2. Security refers to the fact that a birthday attack on a message digest of
size n produces a collision with a work factor of approximately 2n/2.
SHA-3
• SHA-2 shares same structure and mathematical operations as its
predecessors and causes concern
• Due to time required to replace SHA-2 should it become
vulnerable, NIST announced in 2007 a competition to produce
SHA-3
Requirements:
• Must support hash value lengths of 224, 256,384, and 512 bits
• Algorithm must process small blocks at a time instead of
requiring the entire message to be buffered in memory before
processing it
CMAC (Cipher-based MAC)
• “Hashless” MAC
• Uses an encryption algorithm (DES, AES, etc.) to generate MAC
• Based on same idea as cipher block chaining
• Compresses result to size of single block (unlike encryption
CBC CMAC Overview
CMAC Facts
• Advantages:
• Can use existing encryption functions
• Encryption functions have properties that resist preimage and collision
attacks
• Most exhibit strong avalanche effect – minor change in message gives
great change in resulting MAC
• Disadvantage:
• Encryption algorithms (particularly when chained) can be much slower
than hash algorithms
30
Encryption + integrity
➢simultaneously protect confidentiality and authenticity of communications
⚫often required but usually separate
➢approaches
⚫Hash-then-encrypt: EK(M || H(M))
⚫MAC-then-encrypt: EK2(M || MACK1(M))
⚫Encrypt-then-MAC: (C=EK2(M), T=MACK1(C)
⚫Encrypt-and-MAC: (C=EK2(M), T=MACK1(M)
Replay attacks
• A MAC inherently cannot prevent replay attacks
• Replay attacks must be prevented at a higher level of the protocol!
• (Note that whether a replay is ok is application-dependent.)
• Replay attacks can be prevented using nonces, timestamps, etc.