CS 411 : Data Compression
Lecture 10
Video Compression
Types of Video signal
   Two types of Video Signal
       Analog video
       Digital video
Analog Video
   Analog video is recorded as a signal.
   Analog video is still used because:
       There are a number of existing video clips already
        stored in analog format on video tapes.
       There is an abundance of equipment currently available
        for recording and playing analog video.
Analog Video
   Analog Video disadvantage:
       Video quality degrades after repeated tape dubbing.
       Video tape is difficult to store and easily damaged by dust and
        humidity.
       Since video tape is linear, we must fast-forward or rewind it to
        get to the video segment we are interested in viewing or
        digitizing.
            This can be time consuming.
            It can also be difficult to get the tape to stop at just the right video
             frame.
Analog Video
   Examples of analog formats:
       VHS: the most common video format
       S-VHS: the quality is better than that of VHS
       Hi-8: the quality is better than that of S-VHS
       Betacam SP: a high-quality, analog video
        format used in professional video editing
Analog Video Signal
   Analog video is transferred by analog signal.
   It contains the:
       luminance (brightness) and
       chrominance (color) of the image.
   Most TVs still send and receive video as an analog signal.
   Three types of Analog Video Signal:
       Component
       Composite
       S-Video
Digital Video
   Refers to video that is already in digital format.
   There is little or no degradation in quality when
    digital video is transferred to the computer
    because there is no conversion.
   DV cameras use tapes or other storage media to
    store the video and sound in digital format.
   Digital video is often used to capture content
    from movies and television to be used in
    multimedia.
   A video source (video camera, VCR, TV or
    videodisc) is connected to a video capture card in
    a computer.
Digital Video Signal
 As the video source is played, the analog
  signal is sent to the video card and
  converted into a digital file (including
  sound from the video).
 It is transferred by digital signal.
 In most multimedia applications, the video
  signals need to be in a digital form in
  order to:
       store them in the memory of a computer and
       to easily edit and integrate them with other
        media types.
Computer-based digital video
   Three significant advantages:
       It can be copied and reproduced without loss of quality.
       It can be manipulated easily (repositioned, resized, and
        recolored) by a computer.
       It is easier to transmit over computer networks.
   Three disadvantages:
       It requires an enormous amount of computer storage
        space.
       It requires high transfer rates.
       Quality digital video requires large file sizes and high
        transfer rates, so most of the digital video currently
        available makes compromises that produce images
        lower in quality than those on VHS tapes.
Video Quality
There are two factors that affect the quality of
digital video:
1. Frame rate
        the number of images displayed within a specified
         amount of time to convey a sense of motion.
        Frame rate is measured in frames per second (fps): the
         number of images displayed per second.
2.   Video resolution
        Video resolution refers to the image resolution of the
         frames in the video.
        Video resolution is measured in pixels per inch (PPI).
                      Frame rate
   The frame rate has been a problem, with a
    lot of real-time video (such as video
    conferencing software) achieving rates of
    10 frames per second (fps) or less.
       The frame rate should be between 15 and 30
        fps for smooth movement.
       Frames displayed at a slower rate appear
        choppy.
   Television frame rates are:
       30 fps (29.97 fps for color) for NTSC (American standard) and
       25 fps for PAL (Europe, NZ standard).
Characteristics of digital video
 Resolution: provided that the frame size
  remains unchanged, the higher the video
  resolution, the better the quality and the
  larger the file size.
 Other important considerations for video
  delivery:
       The bandwidth,
       Processor speed,
       memory, and
       monitor size.
Frame size
 The VGA standard: monitors with
  resolutions of 640 x 480 pixels.
 Full-size frames need a lot of image storage and processing
  power, so frame sizes are usually smaller than
  640 x 480 pixels.
 Common frame sizes :
       640 x 480: full-screen VGA display
       320 x 240: a quarter of a VGA display
       240 x 180: about a seventh of a VGA display
       160 x 120: a sixteenth of a VGA display
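The relation between each frame size and a full VGA frame can be checked by comparing pixel counts. The following is a small sketch; the function name fraction_of_vga is hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Full-screen VGA frame size, as given in the list of common frame sizes.
VGA_WIDTH, VGA_HEIGHT = 640, 480

def fraction_of_vga(width, height):
    """Return the frame's pixel count as an exact fraction of a VGA frame."""
    return Fraction(width * height, VGA_WIDTH * VGA_HEIGHT)

for width, height in [(640, 480), (320, 240), (240, 180), (160, 120)]:
    share = fraction_of_vga(width, height)
    print(f"{width} x {height}: {share} of the VGA pixel count")
```

Halving both the width and the height cuts the pixel count to a quarter, which is why 160 x 120 holds one sixteenth of the pixels of a VGA frame; 240 x 180 holds 9/64 of them.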
Digital video data sizing
Digital video file size (in bytes) = F * C * R * T
Where:
     F = frame size (width x height, in pixels)
     C = color depth (in bytes per pixel)
     R = frame rate (frames per second)
     T = time in seconds
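The sizing formula translates directly into code. This is a minimal sketch for uncompressed video without sound; the function name video_file_size and the sample values (320 x 240, 3 bytes per pixel, 15 fps, 60 seconds) are assumptions made for illustration.

```python
def video_file_size(width, height, color_depth_bytes, fps, seconds):
    """Uncompressed video size in bytes: F * C * R * T."""
    frame_size = width * height  # F: pixels per frame
    return frame_size * color_depth_bytes * fps * seconds

# Hypothetical clip: 320 x 240, 24-bit color (3 bytes), 15 fps, one minute.
size = video_file_size(width=320, height=240, color_depth_bytes=3, fps=15, seconds=60)
print(f"{size:,} bytes")  # 207,360,000 bytes
```

Because the four factors are simply multiplied, halving any one of them halves the file size.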
Calculate Video File Size (Example 1)
   Example 1:
        Duration = 10 mins
        Frame rate = 25 fps
        Frame size = 160 by 120
        Color resolution = 8-bit
   Solution 1:
    Video file size = 600 sec x 25 fps x 160 x 120 x (8 bits / 8)
    = 288,000,000 bytes
Calculate Video File Size (Example 2)
   Example 2:
        Duration = 10 sec
        Frame rate = 30 fps
        Frame size = VGA
        Color resolution = 2 bytes
        Sampling rate/Frequency=44.1 kHz
        Sound resolution=8-bit
        Channel=Stereo
   Solution 2:
    Video file size
    = 10 sec x 30 fps x 640 x 480 x 2 bytes (video)
      + 10 sec x 44,100 Hz x (8 bits / 8) x 2 channels (audio)
    = 184,320,000 + 882,000 bytes
    = 185,202,000 bytes
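Example 2 adds an uncompressed sound track to the video data. The sketch below reproduces the calculation; the function names video_size and audio_size are hypothetical, and both functions assume uncompressed data.

```python
def video_size(width, height, bytes_per_pixel, fps, seconds):
    """Uncompressed video size in bytes."""
    return width * height * bytes_per_pixel * fps * seconds

def audio_size(sample_rate_hz, bits_per_sample, channels, seconds):
    """Uncompressed audio size in bytes (bits are converted to bytes)."""
    return sample_rate_hz * bits_per_sample * channels * seconds // 8

video = video_size(640, 480, 2, 30, 10)   # VGA, 2 bytes per pixel, 30 fps, 10 s
audio = audio_size(44_100, 8, 2, 10)      # 44.1 kHz, 8-bit, stereo, 10 s
print(f"video: {video:,} bytes")          # video: 184,320,000 bytes
print(f"audio: {audio:,} bytes")          # audio: 882,000 bytes
print(f"total: {video + audio:,} bytes")  # total: 185,202,000 bytes
```

The audio track accounts for less than half a percent of the total, which is why the size-reduction measures concentrate on the video data.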
Ways to reduce video file size
   Reduce the size of the playback window
       for Internet delivery, e.g. 160 x 120 pixels.
   Decrease the number of colors,
       from 16 million to 256 or even 16 colors.
   Reduce the frame rate
       from 30 fps down to 15 fps or fewer,
        at the cost of jerkier motion.
   Compress the file.
Digital Video File Formats
   AVI (Audio Video Interleave)
       Microsoft standard.
   MPEG (Moving Picture Experts Group)
       MPEG-4 is the global multimedia standard,
        delivering professional-quality audio and video
        streams over a wide range of bandwidths,
        from cell phone to broadband.
   WMV (Windows Media Video)
       Proprietary to the Windows operating system.
       Used by Windows Movie Maker.
Digital Video File Formats
   ASF (Advanced Systems Format)
       Formerly known as Advanced Streaming
        Format
       Microsoft's proprietary format for streaming
       Stores audio and video information
       Specially designed to run on networks
       Content is delivered to users as continuous
        flow of data; little waiting time will be
        experienced before playback begins
Motion in Video
   Video is not an arbitrary concatenation of
    images, but a sequence of images
    carrying a coherent interpretation of a
    natural scene
     Ordering is important
     Sampling rate is important
     The role of a single frame is less important
    due to the masking effect of the human visual system (HVS)
How to Understand Video?
   Understand the source
       How to model the motion of a camera?
        (relatively easy)
       How to model the motion in the real
        world? (notoriously difficult)
   Understand the mechanism of time-
    varying image formation model
       Two sides: geometric and photometric
Overview of Video Processing
[Figure: block diagram of video processing with the components Video Acquisition, Computer Graphics, Video Manipulation, Video Compression, Video Transmission, Video Analysis, Video Display, Video Database, and Computer Vision.]
Two-Dimensional Motion Estimation
General Consideration
Motion Representation
Notations
Motion Estimation Criterion
Optimization Methods
What is new with motion estimation?
   The familiar way – Full search
   Full search is not so efficient
   Some of the most popular fast search
    algorithms:
      Three-step search
      Two-dimensional logarithmic search
      Four-step search
      Diamond search
      Hexagon search
      Orthogonal search
      And many more
So what is the best?
   There is a trade-off between the run time and
    the accuracy.
   Full search will be most accurate because of
    exhaustive search, but will require more time
   Fast search is faster, but its accuracy is reduced
    because only a subset of the candidate positions is evaluated.
   We implemented three of the most popular
    fast search algorithms for comparison:
       Three-step search
       Two-dimensional logarithmic search
       Four-step search
Block-Based Motion Estimation
Motion Computation
 Predictive search
 Look for match window within a given
  search window
       Match window – macro-block
       Search window – arbitrary window size
        depending on how far away we are willing to look
   Displacement of two match windows is
    expressed by motion vector
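The search for a match window inside a search window can be sketched as an exhaustive (full) search. The code below is an illustration under stated assumptions, not a reference implementation: frames are plain lists of rows, the cost is the sum of absolute differences, the motion vector points from the block position in the current frame to the matching position in the reference frame, and all names (make_frame, block_sad, full_search) are hypothetical.

```python
def make_frame(width, height, square_x, square_y, n=4):
    """Black test frame containing one n x n square with distinct pixel values."""
    frame = [[0] * width for _ in range(height)]
    for r in range(n):
        for c in range(n):
            frame[square_y + r][square_x + c] = 100 + 4 * r + c
    return frame

def block_sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in the
    current frame and the block displaced by (dx, dy) in the reference frame."""
    return sum(
        abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
        for r in range(n)
        for c in range(n)
    )

def full_search(cur, ref, bx, by, n, p):
    """Test every displacement within +/- p pixels; return (motion vector, cost)."""
    height, width = len(ref), len(ref[0])
    best_cost, best_mv = None, (0, 0)
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + n > width or y + n > height:
                continue  # candidate block would fall outside the frame
            cost = block_sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost

# The square sits at (8, 9) in the reference frame and at (10, 10) in the current frame.
ref = make_frame(32, 32, 8, 9)
cur = make_frame(32, 32, 10, 10)
mv, cost = full_search(cur, ref, bx=10, by=10, n=4, p=7)
print(mv, cost)  # (-2, -1) 0
```

With p = 7 the loop visits (2 x 7 + 1)^2 = 225 candidate positions, which is the cost that fast search algorithms try to avoid.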
Matching Methods
   SSD metric: $\mathrm{SSD} = \sum_{i=0}^{N-1} (x_i - y_i)^2$
   SAD metric: $\mathrm{SAD} = \sum_{i=0}^{N-1} |x_i - y_i|$
   The candidate with the minimum error is the best match
       the error must be below a specified threshold
       error and perceptual similarity are not always
        correlated
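The two metrics can be written in a few lines. This is a minimal sketch that treats each block as a flat sequence of N pixel values; the function names ssd and sad and the sample values are assumptions made for illustration.

```python
def ssd(block_x, block_y):
    """Sum of squared differences between two equally sized blocks."""
    return sum((x - y) ** 2 for x, y in zip(block_x, block_y))

def sad(block_x, block_y):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for x, y in zip(block_x, block_y))

current = [10, 12, 14, 16]
candidate_a = [10, 12, 14, 24]  # one large error
candidate_b = [12, 14, 16, 18]  # four small errors

print(sad(current, candidate_a), sad(current, candidate_b))  # 8 8
print(ssd(current, candidate_a), ssd(current, candidate_b))  # 64 16
```

Both candidates have the same SAD, but SSD prefers the candidate with many small errors because squaring penalizes a single large error more heavily. The two metrics can therefore select different best matches.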
Block-Matching Algorithm
Example of Finding Minimal SSD
Example of Comparing Minimal SSD and
SAD
Exhaustive Block Matching Algorithm
(EBMA)
               Fast BMA: 3-Step-Search
It is one of the earliest fast block matching algorithms. It runs as
follows:
1. Start with the search location at the center, (0,0).
2. Set step size S = 4 and search parameter p = 7.
3. Search the 8 locations +/- S pixels around the current search origin, together with the
origin itself.
4. Among the 9 locations searched, pick the one with the minimum cost
function.
5. Set the new search origin to the location picked above.
6. Set the new step size to S = S/2.
7. Repeat steps 3 to 6 until the search with S = 1 has been completed.
The location found with S = 1 is the one with the minimum cost function, and the
macro-block at this location is the best match.
This reduces computation by a factor of 9. For p = 7,
exhaustive search (ES) evaluates the cost for 225 macro-blocks, while three-step search (TSS)
evaluates it for only 25 macro-blocks.
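The steps above can be sketched as follows. This is an illustration under stated assumptions: the cost function is the sum of absolute differences, frames are plain lists of rows, candidates outside the frame are skipped, and the names make_frame, block_sad, and three_step_search are hypothetical.

```python
def make_frame(width, height, square_x, square_y, n=4):
    """Black test frame containing one n x n square with distinct pixel values."""
    frame = [[0] * width for _ in range(height)]
    for r in range(n):
        for c in range(n):
            frame[square_y + r][square_x + c] = 100 + 4 * r + c
    return frame

def block_sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the block at (bx, by) in cur and the block displaced by (dx, dy) in ref."""
    return sum(
        abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
        for r in range(n)
        for c in range(n)
    )

def three_step_search(cur, ref, bx, by, n, step=4):
    """Return (motion vector, cost, number of cost evaluations)."""
    height, width = len(ref), len(ref[0])
    cx, cy = 0, 0  # current search origin, expressed as a displacement
    best_cost = block_sad(cur, ref, bx, by, 0, 0, n)
    evaluated = 1
    while step >= 1:
        best_dx, best_dy = cx, cy
        for oy in (-step, 0, step):
            for ox in (-step, 0, step):
                if ox == 0 and oy == 0:
                    continue  # the cost at the origin is already known
                dx, dy = cx + ox, cy + oy
                x, y = bx + dx, by + dy
                if x < 0 or y < 0 or x + n > width or y + n > height:
                    continue  # candidate block would fall outside the frame
                cost = block_sad(cur, ref, bx, by, dx, dy, n)
                evaluated += 1
                if cost < best_cost:
                    best_cost, best_dx, best_dy = cost, dx, dy
        cx, cy = best_dx, best_dy  # move the origin to the best location
        step //= 2                 # S = 4, 2, 1, then stop
    return (cx, cy), best_cost, evaluated

ref = make_frame(32, 32, 8, 9)     # square at (8, 9) in the reference frame
cur = make_frame(32, 32, 10, 10)   # same square at (10, 10) in the current frame
mv, cost, evaluated = three_step_search(cur, ref, bx=10, by=10, n=4)
print(mv, cost, evaluated)  # (-2, -1) 0 25
```

On this simple test frame the search reaches the true displacement after 9 + 8 + 8 = 25 cost evaluations. On real video the cost surface can have several local minima, so the result is not guaranteed to equal that of a full search.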
Fast BMA: 3-Step-Search
[Figure: three-step search pattern; 9 + 8 + 8 = 25 points are searched.]
      Two Dimensional Logarithmic Search
Two-dimensional logarithmic search (TDLS) is closely related to TSS; however, it is more accurate for estimating motion
vectors when the search window is large. The algorithm can be described as follows:
 1. Start with the search location at the center.
 2. Select an initial step size, say S = 8.
 3. Search the 4 locations at a distance of S from the center on the X and Y
 axes.
 4. Find the point with the least cost function.
 5. If a point other than the center is the best matching point,
       1. Select this point as the new center.
       2. Repeat steps 3 to 4 with the current step size.
 6. If the best matching point is at the center, set S = S/2 and, while S > 1, repeat from step 3.
 7. If S = 1, search all 8 locations around the center at a distance
 S.
 8. Set the motion vector to the point with the least cost function.
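One possible reading of this procedure is sketched below. It assumes that the step size is kept when the center moves and halved only when the center remains the best point, that the cost function is the sum of absolute differences, and that candidates outside the frame are skipped. All names (make_frame, block_sad, tdl_search) are hypothetical.

```python
def make_frame(width, height, square_x, square_y, n=4):
    """Black test frame containing one n x n square with distinct pixel values."""
    frame = [[0] * width for _ in range(height)]
    for r in range(n):
        for c in range(n):
            frame[square_y + r][square_x + c] = 100 + 4 * r + c
    return frame

def block_sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the block at (bx, by) in cur and the block displaced by (dx, dy) in ref."""
    return sum(
        abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
        for r in range(n)
        for c in range(n)
    )

def tdl_search(cur, ref, bx, by, n, step=8):
    """Two-dimensional logarithmic search; returns (motion vector, cost)."""
    height, width = len(ref), len(ref[0])

    def cost_at(dx, dy):
        x, y = bx + dx, by + dy
        if x < 0 or y < 0 or x + n > width or y + n > height:
            return None  # candidate block would fall outside the frame
        return block_sad(cur, ref, bx, by, dx, dy, n)

    center = (0, 0)
    best_cost = cost_at(0, 0)
    while step > 1:
        best = center
        # Four points at distance S on the X and Y axes.
        for ox, oy in ((-step, 0), (step, 0), (0, -step), (0, step)):
            cost = cost_at(center[0] + ox, center[1] + oy)
            if cost is not None and cost < best_cost:
                best_cost, best = cost, (center[0] + ox, center[1] + oy)
        if best == center:
            step //= 2      # center is still the best point: refine the step
        else:
            center = best   # move the center and keep the step size
    # Final stage (S = 1): check all 8 neighbors of the center.
    best = center
    for oy in (-1, 0, 1):
        for ox in (-1, 0, 1):
            if ox == 0 and oy == 0:
                continue
            cost = cost_at(center[0] + ox, center[1] + oy)
            if cost is not None and cost < best_cost:
                best_cost, best = cost, (center[0] + ox, center[1] + oy)
    return best, best_cost

ref = make_frame(32, 32, 8, 9)     # square at (8, 9) in the reference frame
cur = make_frame(32, 32, 10, 10)   # same square at (10, 10) in the current frame
mv, cost = tdl_search(cur, ref, bx=10, by=10, n=4)
print(mv, cost)  # (-2, -1) 0
```

The center only moves when the cost strictly decreases, so the loop always terminates. Unlike three-step search, the number of cost evaluations is not fixed in advance; it depends on how often the center moves.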
Moving Picture Experts Group
(MPEG)
   General Information about MPEG
       Began in 1988; part of the same ISO/IEC committee as JPEG
 MPEG-1/Video
 MPEG/Audio – MP3
 MPEG-2
 MPEG-4
 MPEG-7
 MPEG-21
MPEG Image Preparation
(Resolution and Dimension)
   MPEG defines the image format exactly
       Three components: luminance and two
        chrominance components (2:1:1)
       Resolution of the luminance component: X1 ≤ 768, Y1 ≤
        576 pixels
       Pixel precision is 8 bits for each component
   Example of Video format: 352x240 pixels,
    30 fps; chrominance components:
    176x120 pixels
MPEG Image Preparation - Blocks
 Each image is divided into macro-blocks
 Macro-block : 16x16 pixels for luminance;
  8x8 for each chrominance component
 Macro-blocks are useful for Motion
  Estimation
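The division into macro-blocks can be illustrated with the 352 x 240 example format. This is a sketch only; the function name macroblock_grid is hypothetical, and rounding up for frame sizes that are not multiples of 16 is an assumption (the example format divides evenly).

```python
def macroblock_grid(width, height, block_size=16):
    """Number of block columns and rows needed to cover a width x height plane."""
    cols = -(-width // block_size)   # ceiling division
    rows = -(-height // block_size)
    return cols, rows

# Luminance plane: 352 x 240 pixels, covered by 16 x 16 macro-blocks.
luma_cols, luma_rows = macroblock_grid(352, 240, block_size=16)
# Each chrominance plane: 176 x 120 pixels, covered by 8 x 8 blocks.
chroma_cols, chroma_rows = macroblock_grid(176, 120, block_size=8)

print(luma_cols, luma_rows, luma_cols * luma_rows)  # 22 15 330
print(chroma_cols, chroma_rows)                     # 22 15

# Top-left pixel coordinates of the first three macro-blocks in the luminance plane.
origins = [(col * 16, row * 16) for row in range(luma_rows) for col in range(luma_cols)]
print(origins[:3])  # [(0, 0), (16, 0), (32, 0)]
```

Both grids have 22 x 15 entries, so every 16 x 16 luminance macro-block lines up with exactly one 8 x 8 block in each chrominance component.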
MPEG Video Processing
   Intra frames (same as JPEG)
       typically about 12 frames between I frames
   Predictive frames
       encode from previous I or P reference frame
   Bi-directional frames
       encode from previous and future I or P frames
           I   B B P   B B P B B P B B   I
MPEG Video I-Frames
   Intra-coded images
   I-frames are points of random access in the MPEG stream
   I-frames use 8x8 blocks defined within a macro-block
   No quantization table for all DCT coefficients, only a quantization factor
MPEG Video P-Frames
Motion Estimation Method
   Predictive-coded frames require information from the previous I frame and/or previous P frame for encoding/decoding
   To exploit temporal redundancy, we search the last P or I frame for the block that is most similar to the block under consideration
Motion Computation for P Frames
 Predictive search
 Look for match window within a given
  search window
       Match window – macro-block
       Search window – arbitrary window size
        depending on how far away we are willing to look
   Displacement of two match windows is
    expressed by motion vector
MPEG Video B Frames
Bi-directionally Predictive-coded frames
MPEG Video Decoding
 Display order:
     I1 B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8 I2
 Decoding order:
     I1 P1 B1 B2 P2 B3 B4 P3 B5 B6 I2 B7 B8
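The reordering follows from the fact that a B frame needs both its previous and its future reference frame before it can be decoded. A minimal sketch of that rule is shown below; the function name decoding_order is hypothetical, and frame types are assumed to be identified by the first letter of each label.

```python
def decoding_order(display_order):
    """Move each I or P frame ahead of the B frames that depend on it."""
    decoded, pending_b = [], []
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)    # wait for the next reference frame
        else:
            decoded.append(frame)      # the I or P reference frame goes first
            decoded.extend(pending_b)  # then the B frames displayed before it
            pending_b = []
    decoded.extend(pending_b)          # trailing B frames, if any
    return decoded

display = "I1 B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8 I2".split()
print(" ".join(decoding_order(display)))
# I1 P1 B1 B2 P2 B3 B4 P3 B5 B6 I2 B7 B8
```

The decoder therefore has to buffer reference frames and restore the display order after decoding.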