0% found this document useful (0 votes)
145 views15 pages

Unit31 LZ78

This document discusses Lempel-Ziv compression techniques, specifically LZ78 encoding and decoding algorithms. It provides examples of encoding sample strings using LZ78 and calculating the number of bits transmitted. LZ78 is an adaptive compression technique that builds a dictionary of patterns from the input string. It outputs codewords consisting of a pointer to a matching string pattern and the next symbol. Decoding reconstructs the original string by looking up the codewords in the dictionary.

Uploaded by

dfghjk
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
145 views15 pages

Unit31 LZ78

This document discusses Lempel-Ziv compression techniques, specifically LZ78 encoding and decoding algorithms. It provides examples of encoding sample strings using LZ78 and calculating the number of bits transmitted. LZ78 is an adaptive compression technique that builds a dictionary of patterns from the input string. It outputs codewords consisting of a pointer to a matching string pattern and the next symbol. Decoding reconstructs the original string by looking up the codewords in the dictionary.

Uploaded by

dfghjk
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 15

Lempel-Ziv Compression Techniques

• Classification of Lossless Compression techniques

• Introduction to Lempel-Ziv Encoding: LZ77 & LZ78

• LZ78 Encoding Algorithm

• LZ78 Decoding Algorithm


Classification of Lossless Compression Techniques
Recall what we studied before:

• Lossless Compression techniques are classified into static, adaptive (or dynamic), and
hybrid.

• Static coding requires two passes: one pass to compute probabilities


(or frequencies) and determine the mapping, and a second pass to encode.

• Examples of Static techniques: Static Huffman Coding

• All of the adaptive methods are one-pass methods; only one scan of the
message is required.

• Examples of adaptive techniques: LZ77, LZ78, LZW, and Adaptive


Huffman Coding
Introduction to Lempel-Ziv Encoding
• Data compression up until the late 1970's mainly directed towards creating
better methodologies for Huffman coding.

• An innovative, radically different method was introduced in1977 by


Abraham Lempel and Jacob Ziv.

• This technique (called Lempel-Ziv) actually consists of two considerably


different algorithms, LZ77 and LZ78.

• Due to patents, LZ77 and LZ78 led to many variants:

LZ77 LZR LZSS LZB LZH


Variants
LZ78 LZW LZC LZT LZMW LZJ LZFG
Variants

• The zip and unzip use the LZH technique while UNIX's compress
methods belong to the LZW and LZC classes.
LZ78 Compression Algorithm
LZ78 inserts one- or multi-character, non-overlapping, distinct patterns of
the message to be encoded in a Dictionary.

The multi-character patterns are of the form: C0C1 . . . Cn-1Cn. The prefix of
a pattern consists of all the pattern characters except the last: C0C1 . . . Cn-1

LZ78 Output:

Note: The dictionary is usually implemented as a hash table.


Example 1: LZ78 Compression
Encode (i.e., compress) the string ABBCBCABABCAABCAAB using the LZ78 algorithm.

The compressed message is: (0,A)(0,B)(2,C)(3,A)(2,A)(4,A)(6,B)


Note: The above is just a representation, the commas and parentheses are not transmitted;
we will discuss the actual form of the compressed message later on in slide 12.
Example 1: LZ78 Compression (cont’d)
1. A is not in the Dictionary; insert it
2. B is not in the Dictionary; insert it
3. B is in the Dictionary.
BC is not in the Dictionary; insert it.
4. B is in the Dictionary.
BC is in the Dictionary.
BCA is not in the Dictionary; insert it.
5. B is in the Dictionary.
BA is not in the Dictionary; insert it.
6. B is in the Dictionary.
BC is in the Dictionary.
BCA is in the Dictionary.
BCAA is not in the Dictionary; insert it.
7. B is in the Dictionary.
BC is in the Dictionary.
BCA is in the Dictionary.
BCAA is in the Dictionary.
BCAAB is not in the Dictionary; insert it.
Example 2: LZ78 Compression
Encode (i.e., compress) the string BABAABRRRA using the LZ78 algorithm.

The compressed message is: (0,B)(0,A)(1,A)(2,B)(0,R)(5,R)(2, )


Example 2: LZ78 Compression (cont’d)
1. B is not in the Dictionary; insert it
2. A is not in the Dictionary; insert it
3. B is in the Dictionary.
BA is not in the Dictionary; insert it.
4. A is in the Dictionary.
AB is not in the Dictionary; insert it.
5. R is not in the Dictionary; insert it.
6. R is in the Dictionary.
RR is not in the Dictionary; insert it.
7. A is in the Dictionary and it is the last input character; output a pair
containing its index: (2, )
Example 3: LZ78 Compression
Encode (i.e., compress) the string AAAAAAAAA using the LZ78 algorithm.

1. A is not in the Dictionary; insert it


2. A is in the Dictionary
AA is not in the Dictionary; insert it
3. A is in the Dictionary.
AA is in the Dictionary.
AAA is not in the Dictionary; insert it.
4. A is in the Dictionary.
AA is in the Dictionary.
AAA is in the Dictionary and it is the last pattern; output a pair containing its index:
(3, )
LZ78 Compression: Number of bits transmitted
• Example: Uncompressed String: ABBCBCABABCAABCAAB
Number of bits = Total number of characters * 8
= 18 * 8
= 144 bits
• Suppose the codewords are indexed starting from 1:
Compressed string( codewords): (0, A) (0, B) (2, C) (3, A) (2, A) (4, A) (6, B)
Codeword index 1 2 3 4 5 6 7

• Each code word consists of an integer and a character:


• The character is represented by 8 bits.
• The number of bits n required to represent the integer part of the codeword with
index i is given by:

• Alternatively number of bits required to represent the integer part of the codeword
with index i is the number of significant bits required to represent the integer i – 1
LZ78 Compression: Number of bits transmitted (cont’d)

Codeword (0, A) (0, B) (2, C) (3, A) (2, A) (4, A) (6, B)


index 1 2 3 4 5 6 7
Bits: (1 + 8) + (1 + 8) + (2 + 8) + (2 + 8) + (3 + 8) + (3 + 8) + (3 + 8) = 71 bits

The actual compressed message is: 0A0B10C11A010A100A110B


where each character is replaced by its binary 8-bit ASCII code.
Example 1: LZ78 Decompression
Decode (i.e., decompress) the sequence (0, A) (0, B) (2, C) (3, A) (2, A) (4, A) (6, B)

The decompressed message is: ABBCBCABABCAABCAAB


Example 2: LZ78 Decompression
Decode (i.e., decompress) the sequence (0, B) (0, A) (1, A) (2, B) (0, R) (5, R) (2, )

The decompressed message is: BABAABRRRA


Example 3: LZ78 Decompression
Decode (i.e., decompress) the sequence (0, A) (1, A) (2, A) (3, )

The decompressed message is: AAAAAAAAA


Exercises

1. Use LZ78 to trace encoding the string


SATATASACITASA.

You might also like