CSE 126 Multimedia Systems
P. Venkat Rangan
Spring 2003
Lecture Note 5 (April 15)
JPEG Encoding
There are four main steps in the JPEG encoding scheme.
1. Picture Preparation
o Separate the Y,U, and V components (planes)
o Subsample both U & V by 4x4 pixel regions (i.e. each 4x4 region becomes 1
mega-pixel)
o Split each plane into 8x8 blocks (64 pixels for Y, 64 mega-pixels for U/V)
2. DCT (Discrete Cosine Transform)
o Transform coding that concentrates the information in each 8x8 block into a few
coefficients, reducing the number of bits required to represent the block
o 64 coefficients (1 DC coefficient, 63 AC coefficients) produced
3. Quantization
o Non-uniform quantization applied to the DCT coefficients (higher resolution
given to DC and low frequency coefficients)
o Usually results in most of the higher frequency coefficients quantizing to a value
of 0.
4. Entropy Encoding
o Used to further reduce the amount of space required to store the JPEG image
o Run-length coding can be used on the long sequences of zeroes that remain after
the DCT coefficients are quantized
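The picture preparation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names subsample and split_blocks are made up for this note, each 4x4 region is reduced to its mean (the notes only say the region becomes one value, not how), and plane dimensions are assumed to be multiples of the region and block sizes.

```python
def subsample(plane, factor=4):
    """Reduce each factor x factor region to one value.

    Assumption: the region is replaced by its mean; the notes do not
    specify the reduction. Dimensions must be multiples of factor.
    """
    h, w = len(plane), len(plane[0])
    out = []
    for r in range(0, h, factor):
        row = []
        for c in range(0, w, factor):
            region = [plane[r + dr][c + dc]
                      for dr in range(factor) for dc in range(factor)]
            row.append(sum(region) / len(region))
        out.append(row)
    return out


def split_blocks(plane, size=8):
    """Split a plane into size x size blocks, left to right, top to bottom.

    Assumption: dimensions are multiples of size (no padding is done).
    """
    h, w = len(plane), len(plane[0])
    return [[row[c:c + size] for row in plane[r:r + size]]
            for r in range(0, h, size)
            for c in range(0, w, size)]
```

With these helpers, a 32x32 Y plane splits into 16 blocks, while a 32x32 U or V plane shrinks to 8x8 after subsampling and yields a single block.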
DCT
There are many transforms, most of which are very slow. This is important to consider
since video demands real-time encoding and decoding. The JPEG committee took
suggestions and empirically studied the use of several different transforms. Of the
transforms studied, DCT (Discrete Cosine Transform) proved superior.
In JPEG, DCT operates on one block at a time. Because there are 64 elements in an 8x8
block, this is called the 64-element or 64-coefficient DCT. The DCT transform operates
on this block in a left-to-right, top-to-bottom manner.
Formula for FDCT:
$$S_{i,j} = \frac{1}{4}\, C_i\, C_j \sum_{x=0}^{7} \sum_{y=0}^{7} P_{x,y} \cos\frac{(2y+1)\, i\, \pi}{16} \cos\frac{(2x+1)\, j\, \pi}{16}$$
where
$$C_i = \frac{1}{\sqrt{2}} \text{ when } i = 0, \qquad C_j = \frac{1}{\sqrt{2}} \text{ when } j = 0, \qquad C_i, C_j = 1 \text{ otherwise}$$
$P_{x,y}$ = pixel (or mega-pixel) value at location x, y in the 8x8 block
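The FDCT formula can be evaluated directly with four nested loops. The sketch below is a straightforward, unoptimized transcription and not how a real-time encoder would compute it. The names fdct_8x8 and c are chosen for this note, and the block is assumed to be stored as block[y][x] (row y, column x).

```python
import math

N = 8  # block size used by the formula


def c(k):
    """Scaling factor C_k: 1/sqrt(2) when k == 0, otherwise 1."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


def fdct_8x8(block):
    """Direct evaluation of the FDCT formula.

    Assumption: block[y][x] holds the pixel value at location x, y.
    Returns S, where S[i][j] is the coefficient S_{i,j}.
    """
    S = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * i * math.pi / 16)
                              * math.cos((2 * x + 1) * j * math.pi / 16))
            S[i][j] = 0.25 * c(i) * c(j) * total
    return S
```

Each of the 64 coefficients needs 64 multiply-accumulate steps here, which illustrates why the speed of the transform matters for real-time use.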
Notes about DCT:
The results of a 64-element DCT transform are 1 DC coefficient and 63 AC coefficients.
The DC coefficient is proportional to the average color of the 8x8 region (with the scaling in the formula, it is 8 times the mean pixel value). The 63 AC
coefficients represent color change across the block. Low-numbered coefficients
represent low-frequency color change, or gradual color change across the region. High-
numbered coefficients represent high-frequency color change, or color which changes
rapidly from one pixel to another within the block. These 64 results are written in a zig-
zag order as follows, with the DC coefficient followed by AC coefficients of increasing
frequency.
[Figure: an 8x8 pixel block is passed through the DCT to produce an 8x8 block of DCT coefficients.]
Zig-Zag sequencing:
Note that each diagonal line in this zig-zag sequence contains coefficients whose two
indices add up to the same value. For example, the coefficients with indices
{30, 21, 12, 03} all have indices that add to 3.
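The zig-zag order can be generated by walking the diagonals on which the two indices have a constant sum and alternating the direction of travel. The sketch below assumes the conventional JPEG scan, written as (row, column) pairs and moving right from the DC coefficient first; the function names are made up for this note.

```python
def zigzag_order(n=8):
    """Return the (row, column) pairs of an n x n block in zig-zag order.

    Assumption: conventional JPEG scan, i.e. (0,0), (0,1), (1,0), (2,0), ...
    """
    order = []
    for s in range(2 * n - 1):          # s = row + column, constant per diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)        # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def zigzag_scan(block):
    """Flatten a square block (list of rows) into zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]
```

Scanning a block this way puts the DC coefficient first and the highest-frequency coefficient last, so the values most likely to be zero end up together at the end of the sequence.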
Why is this ordering important? Well, if you think of a block of 8x8 pixels out of a
coherent image, the pixels are likely to be very similar. If you run DCT on 64 pixels
which are very similar, you will get a DC coefficient and some values for the low-
frequency AC coefficients; the remaining coefficients will likely be at or near zero. Try
to imagine creating an image out of pixels that vary wildly from their neighbors. The
resulting image would more than likely not make much sense; it would just be a mess of dots.
To give you an idea of how small an 8x8 region is, consider the following example:
The 8x8 region of pixels highlighted above looks like this (magnified 1600 times)
As you can see, this region does not deviate much from its average color. In addition, the
change is slow and gradual across the block rather than sharp and abrupt from pixel to
pixel.
This observation about images allows us to place much greater importance on the DC
and first few AC coefficients (beginning of the zig-zag sequence). It also allows us to
assume that the high-frequency AC coefficients (remainder of the sequence) will be at or
near zero.
Logically, if these values are of little importance we should be able to assign fewer bits to
them in order to achieve greater compression. This naturally leads us to the stages of
quantization and entropy encoding, which we will cover next time.
Two examples for DCT.
Example 1: an 8x8 block in which every pixel is the same red, so every pixel value
equals a constant p.
Using the formula for the DCT we get
S0,0 = 1/4 * 1/2 * 64p = 8p (that is, 1/8 times the sum of the 64 pixel values)
For all other i, j the cosine values cancel each other, so Si,j is zero.
Example 2: an 8x8 block in which the left half of the block is red and the right half
of the block is blue.
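Both examples can be checked numerically. The sketch below evaluates the FDCT formula on a single plane, using the hypothetical sample values p and q to stand in for "red" and "blue". The block is assumed to be stored as block[y][x], so i counts vertical frequency and j counts horizontal frequency. The notes do not work out Example 2; what follows is simply what the formula gives under these assumptions.

```python
import math

N = 8
# COS[k][n] = cos((2n + 1) * k * pi / 16), the cosine term of the formula
COS = [[math.cos((2 * n + 1) * k * math.pi / 16) for n in range(N)]
       for k in range(N)]
SCALE = [1 / math.sqrt(2)] + [1.0] * (N - 1)  # C_0 = 1/sqrt(2), otherwise 1


def fdct_8x8(block):
    """Evaluate the FDCT formula; block[y][x] in, S[i][j] out."""
    return [[0.25 * SCALE[i] * SCALE[j]
             * sum(block[y][x] * COS[i][y] * COS[j][x]
                   for y in range(N) for x in range(N))
             for j in range(N)]
            for i in range(N)]


p, q = 200.0, 50.0  # hypothetical plane values for "red" and "blue"

uniform = [[p] * N for _ in range(N)]             # Example 1
halves = [[p] * 4 + [q] * 4 for _ in range(N)]    # Example 2: left p, right q

S1 = fdct_8x8(uniform)
S2 = fdct_8x8(halves)
```

For the uniform block, S1[0][0] comes out as 8p and every other entry is zero up to rounding error. For the half-and-half block, the DC term is 4(p + q), which is 8 times the mean value. The only nonzero AC terms are S2[0][j] for odd j, because the block changes only in the horizontal direction and the two halves are mirror images apart from their values.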