In 1666, Sir Isaac Newton discovered that when a beam of sunlight passes through a glass prism, the emerging beam is split into a spectrum of colours.
What Determines the Color We See?
Color perception depends on the wavelengths of visible light that are reflected (not absorbed)
by an object.
Example:
o A green object reflects light in the 500–570 nm range and absorbs other wavelengths.
🌟 Chromatic vs Achromatic Reflection:
Type of Reflection | Wavelength Behavior | What We See
Balanced reflection across 400–700 nm | All visible wavelengths equally reflected | White
Selective reflection (e.g., 500–570 nm) | Only a specific range reflected | Color (e.g., green)
No reflection (complete absorption) | No visible light reflected | Black
🎨 Key Properties of Chromatic Light:
Chromatic light = Colored visible light
It spans 400 to 700 nm (from violet to red).
Transitions between colors are smooth, not abrupt (i.e., the spectrum is continuous).
Cones and Human Color Vision
Humans perceive color through cone cells in the retina. These cones are photoreceptor cells sensitive to
different ranges of wavelengths of light in the visible spectrum.
Approximately 65% of all cones are sensitive to red light, 33% to green light, and only 2% to blue light.
Achromatic Light (No Color)
When color is ignored, the only attribute left is intensity.
Gray level = A measure of intensity from black (0) to white (maximum).
📏 Three Quantities for Chromatic Light Quality
Quantity | Description
Radiance | Total energy emitted by a light source in all directions (an objective measure).
Luminance | Amount of energy perceived by the human eye from a given direction (what we see).
Brightness | Subjective perception of how "bright" a light appears; varies by observer and is hard to quantify.
🌟 Think of it like this:
Radiance is what the source emits,
Luminance is what the eye receives,
Brightness is how you feel about it.
🎨 Additive Color Mixing (Light)
Based on Red, Green, Blue (RGB) primaries.
When lights are added, they form secondary colors:
Combination | Result
Red + Green | Yellow
Red + Blue | Magenta
Green + Blue | Cyan
All three (R + G + B) | White light
Used in screens, projectors, digital displays.
🖌️ Subtractive Color Mixing (Pigments/Inks)
Based on Cyan, Magenta, Yellow (CMY).
Used in printing, painting, dyes.
Each pigment subtracts (absorbs) its opposite color from white light.
Combination | Result
Cyan + Yellow | Green
Cyan + Magenta | Blue
Magenta + Yellow | Red
All three (C + M + Y) | Black (in theory)
Subtractive = Removing wavelengths from white light.
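A minimal sketch of the two mixing models, assuming colours are represented as normalized RGB triples (the specific values here are illustrative):

```python
import numpy as np

# Additive mixing of light: contributions of the RGB primaries are added
# (clipped to the displayable range).
red   = np.array([1.0, 0.0, 0.0])
green = np.array([0.0, 1.0, 0.0])
blue  = np.array([0.0, 0.0, 1.0])

yellow = np.clip(red + green, 0, 1)           # [1, 1, 0]
white  = np.clip(red + green + blue, 0, 1)    # [1, 1, 1]

# Subtractive primaries: each pigment is the complement of an additive primary,
# i.e. it absorbs ("subtracts") that component from white light: CMY = 1 - RGB.
cyan, magenta, yellow_ink = 1 - red, 1 - green, 1 - blue
print(yellow, white, cyan)
```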
Characteristics Used for Color Differentiation
When distinguishing colors, we typically use three main attributes:
1. Brightness
Similar to intensity in black-and-white (monochrome) images.
Refers to how light or dark a color appears.
It is achromatic, meaning it doesn't indicate color, just light level.
2. Hue
Represents the dominant wavelength in the light.
It’s the name of the color we perceive: red, blue, green, etc.
Example: A light dominated by ~500 nm is perceived as green.
3. Saturation
Refers to the purity of a color — how much it is diluted with white light.
High saturation = pure, vivid color
Low saturation = washed out, pastel-like color
Examples:
o Red = High saturation
o Pink (Red + White) = Low saturation
o Lavender (Violet + White) = Low saturation
🌈 Chromaticity = Hue + Saturation
Chromaticity expresses the quality of color without involving brightness.
It includes:
o Hue (what color it is)
o Saturation (how pure or faded the color is)
🎯 Chromaticity shows:
What the main color is (hue),
And how much it is diluted by white light (saturation).
What Are Colour Models?
Colour models are systems used to represent colors in a standardized way, typically using numerical
values.
They define a coordinate system in which each color is a point.
Used for both hardware (like monitors and printers) and software applications (like image
editing, computer vision, etc.).
📚 Types of Colour Models and Their Uses:
Colour Model | Used In | Purpose
RGB | Screens, cameras | Displays color using light (additive model)
CMY / CMYK | Printing | Uses ink or pigment (subtractive model)
HSI / HSV / HSL | Image processing, design | Matches human perception of color
🟥 RGB Colour Model (Additive Model)
Based on Red, Green, Blue as primary colors of light.
Represented as a 3D Cartesian cube:
Cube Structure:
Red, Green, Blue: at three corners
Cyan, Magenta, Yellow: at opposite corners (secondary colors)
Black: Origin (0, 0, 0)
White: Furthest corner (255, 255, 255)
Other colors: Anywhere inside the cube
RGB Image:
Composed of three component images: one for R, one for G, and one for B
Combined to produce the full-color image on screen
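A minimal sketch of this structure, assuming a hypothetical H×W×3 NumPy array `img` holding a 24-bit image:

```python
import numpy as np

# Hypothetical 24-bit RGB image: three 8-bit component images stacked along the last axis.
img = np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8)

# The three component images that are combined to form the full-colour image.
r_channel = img[:, :, 0]
g_channel = img[:, :, 1]
b_channel = img[:, :, 2]
print(r_channel.shape, g_channel.shape, b_channel.shape)  # (4, 4) each
```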
💾 Colour Depth
Colour depth = Number of bits used to represent a pixel
Common example: 24-bit image
o 8 bits per channel (R, G, B)
o Total colors: 2^8 × 2^8 × 2^8 = 16,777,216 colors
o Called True Color or Full Color
🌐 Web-Safe Colours
Why needed? Different hardware may render colors slightly differently
Web-safe colours: Subset of 216 standardized colors that appear consistently across all
systems
Useful in:
o Web development
o Designing for accessibility
o Ensuring consistent appearance on older or limited-color displays
Why HSI Instead of RGB?
RGB is good for machines and hardware (screens, cameras).
HSI is better for humans because it separates color from lightness:
o We say “light blue” or “dark red”, not “30% red + 40% green…”
Key Concepts from Your Notes Simplified:
Hue is based on a color circle — 0° = red, 120° = green, 240° = blue.
Saturation = distance from the center of the circle (more colorful = farther)
Intensity = height up the vertical axis (black at bottom, white at top)
Colors with the same intensity lie in the same horizontal plane
Saturation is zero on the vertical axis (pure grays)
We would see a hexagonal shape with each primary colour separated by 120° and secondary colours at 60° from the primaries.
Why Use HSI?
Image enhancement: Easily adjust brightness without affecting hue.
Segmentation: Detect certain hues (like green plants or red traffic signs).
Perception-based: Closer to how humans describe and recognize color.
Converting RGB → HSI
Let:
R, G, B ∈ [0, 1] (normalize first if values are in 0–255)
All angles are in degrees
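The notes don't list the conversion formulas themselves, so here is a sketch of the standard RGB → HSI conversion for a single pixel (the function name and the small tolerance constant are my own):

```python
import numpy as np

def rgb_to_hsi(r, g, b):
    """Convert one RGB triple (each in [0, 1]) to (H in degrees, S, I)."""
    i = (r + g + b) / 3.0                           # intensity: average of the three channels
    s = 0.0 if i == 0 else 1.0 - min(r, g, b) / i   # saturation: 1 - min(R,G,B)/I
    if s == 0:
        return 0.0, 0.0, i                          # hue is undefined for pure grays
    # Hue from the angle formula, measured on the colour circle.
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-12
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    h = theta if b <= g else 360.0 - theta
    return h, s, i

print(rgb_to_hsi(0.0, 1.0, 0.0))   # pure green -> (120.0, 1.0, 0.333...)
```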
Converting HSI → RGB
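The inverse conversion also isn't spelled out in the notes; a sketch of the usual sector-based formulas (H in degrees, S and I in [0, 1]; the helper name is my own):

```python
import numpy as np

def hsi_to_rgb(h, s, i):
    """Convert (H in degrees, S, I) back to an RGB triple in [0, 1]."""
    h = h % 360.0

    def sector(hh):
        # The two "angular" components within a 120-degree sector.
        low  = i * (1 - s)
        high = i * (1 + s * np.cos(np.radians(hh)) / np.cos(np.radians(60 - hh)))
        return low, high

    if h < 120:                      # RG sector
        b_, r_ = sector(h)
        g_ = 3 * i - (r_ + b_)
    elif h < 240:                    # GB sector
        r_, g_ = sector(h - 120)
        b_ = 3 * i - (r_ + g_)
    else:                            # BR sector
        g_, b_ = sector(h - 240)
        r_ = 3 * i - (g_ + b_)
    return r_, g_, b_

print(hsi_to_rgb(120.0, 1.0, 1/3))   # recovers (0, 1, 0) up to rounding
```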
Pseudo Color Image Processing – Intensity Slicing
This is a technique used to assign artificial colors to grayscale images to enhance features.
🧠 How it works:
Treat the grayscale image as a 3D surface where pixel intensity is height.
Place "horizontal slices" (planes) at certain intensity levels.
Assign different colors to each slice:
o e.g., pixels with intensity 0–50 → blue
o 51–100 → green
o 101–150 → yellow
o …and so on.
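A minimal sketch of intensity slicing with NumPy (the slice boundaries and colours below just mirror the example above; `np.digitize` assigns each pixel to its slice):

```python
import numpy as np

# Hypothetical 8-bit grayscale image.
gray = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)

# Slice boundaries and one colour per slice (values chosen for illustration).
bounds  = [51, 101, 151]                                 # upper edges of the first three slices
colours = np.array([[0, 0, 255],      # 0-50    -> blue
                    [0, 255, 0],      # 51-100  -> green
                    [255, 255, 0],    # 101-150 -> yellow
                    [255, 0, 0]],     # 151-255 -> red
                   dtype=np.uint8)

slice_index   = np.digitize(gray, bounds)                # which slice each pixel falls into
pseudo_colour = colours[slice_index]                     # (64, 64, 3) false-colour image
print(pseudo_colour.shape)
```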
Morphological Operations
Segmentation is the process of dividing an image into meaningful regions — typically by separating
objects from the background or separating different objects within the image.
After segmentation, the resulting image often contains imperfections such as:
Small unwanted regions (noise)
Gaps or holes in objects
Rough or irregular edges
To clean and refine this segmented image, we apply morphological operations — fundamental tools in
image processing that focus on the shape and structure of objects in a binary or grayscale image.
Morphological image processing (or morphology) describes a range of image processing techniques that deal with the shape (or morphology) of features in an image. Morphological operations are typically applied to remove imperfections introduced during segmentation, and so typically operate on bi-level (binary) images.
These techniques are used to:
Clean up noise
Fill gaps
Separate objects
Detect structure or boundaries
Structuring Element (SE):
A structuring element is a small binary matrix (e.g., 3×3 or 5×5) used to probe an image in
morphological operations.
It has:
A defined shape (e.g., square, cross, disk)
An origin (usually the center pixel)
✅ Fit vs Hit Concepts:
Concept | Explanation | Example Behavior
Fit | All "on" (1) pixels in the SE must exactly cover "on" pixels in the image | Used in Erosion
Hit | Any "on" (1) pixel in the SE overlaps with an "on" pixel in the image | Used in Dilation
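A minimal sketch of fit (erosion) and hit (dilation) with a 3×3 square structuring element, assuming SciPy is available:

```python
import numpy as np
from scipy.ndimage import binary_erosion, binary_dilation

# Small binary image (a 3x3 foreground block) and a 3x3 square SE with its origin at the centre.
image = np.zeros((5, 5), dtype=bool)
image[1:4, 1:4] = True
se = np.ones((3, 3), dtype=bool)

# Erosion keeps a pixel only where the SE "fits" entirely inside the foreground.
eroded = binary_erosion(image, structure=se)

# Dilation keeps a pixel wherever the SE "hits" (overlaps) at least one foreground pixel.
dilated = binary_dilation(image, structure=se)

print(eroded.astype(int))    # only the centre pixel survives
print(dilated.astype(int))   # the block grows by one pixel in every direction
```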
Image Compression
Need for data (not information) compression
Data compression aims to reduce the amount of data needed to convey some information.
Different amounts of data can be used to deliver the same piece of information.
1. Coding Redundancy
What it means: The number of bits used to store each pixel's intensity is more than necessary.
Example: If you always use 8 bits per pixel (values from 0 to 255), but your image only uses
values between 0 and 31, then you're wasting bits.
Goal: Use fewer bits to store the same information. This is often handled using entropy coding
like Huffman or Arithmetic coding.
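A rough sketch of the idea: the entropy of the gray-level histogram gives a lower bound on the average bits/pixel achievable with entropy coding (the image values here are hypothetical):

```python
import numpy as np

# Hypothetical image that only uses gray levels 0-31, yet is stored at 8 bits/pixel.
img = np.random.randint(0, 32, size=(128, 128))

counts  = np.bincount(img.ravel(), minlength=256)
p       = counts[counts > 0] / counts.sum()
entropy = -(p * np.log2(p)).sum()
print(f"fixed-length coding: 8 bits/pixel, entropy bound: {entropy:.2f} bits/pixel")
```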
🔹 2. Interpixel / Spatial / Temporal Redundancy
What it means: Pixels near each other (in space or time, like in video) are usually very similar.
Example: In an image of the sky, many blue pixels are nearly identical. Storing each one
separately is inefficient.
Goal: Take advantage of the similarity between pixels. Techniques like predictive coding or
transform coding (e.g., DCT in JPEG) help reduce this redundancy.
🔹 3. Psychovisual Redundancy
What it means: The human eye cannot detect all the details in an image—especially slight
changes in color or brightness.
Example: Removing high-frequency details or subtle color differences might not affect what
people see.
Goal: Remove information that doesn’t impact perceived image quality. This leads to lossy
compression (like JPEG), which sacrifices some accuracy for smaller file sizes.
🔹 What Is Differential Coding?
Differential coding doesn't encode the actual pixel values, but instead encodes the difference:
current_pixel - predicted_pixel
Since neighboring pixels are often similar, the difference is small, and smaller differences can be encoded
using fewer bits.
Assumption: 8 Bits/Sample
This means each pixel originally takes 8 bits (values from 0 to 255).
🔹 Difference Signal Between Pixels
The difference signal is calculated as:
d(i) = x(i) - x(i-1)
Where:
x(i) is the current pixel
x(i-1) is the previous pixel
d(i) is the difference
Then the dynamic range of the difference roughly doubles (from 256 levels to about 512, since d(i) can range from −255 to +255), but the variance of the differences is actually much smaller, so most of them cluster near zero.
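A minimal numeric sketch of this effect on one (hypothetical) image row:

```python
import numpy as np

# One row of a hypothetical 8-bit image: neighbouring pixels tend to be similar.
x = np.array([100, 102, 101, 103, 104, 104, 105], dtype=np.int16)

# Difference signal d(i) = x(i) - x(i-1); the first sample would be sent as-is.
d = np.diff(x)
print(d)                    # small values clustered around 0
print(x.var(), d.var())     # the differences have a much smaller variance
```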
What is DPCM (Differential Pulse Code Modulation)?
DPCM is a predictive compression technique used to reduce the number of bits needed to represent a
signal (like an image or audio) by encoding the difference between the current value and a predicted
value.
✅ How It Works (Simple Steps):
1. Prediction:
Predict the current sample based on past sample(s).
Example (simple): x̂(i) = x(i−1)
where:
o x(i) = current actual value
o x̂(i) = predicted value
2. Differencing:
Calculate the difference:
d(i) = x(i) − x̂(i)
This value is usually small.
3. Quantization:
Quantize d(i) to reduce the number of bits (a lossy step if quantization is coarse).
4. Encoding:
Encode the quantized difference using fewer bits.
5. Reconstruction:
On decoding, reconstruct using:
x(i) = x̂(i) + d(i)
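A minimal sketch of these five steps for a 1-D signal, using previous-sample prediction and a uniform quantizer (the step size is an arbitrary choice for illustration):

```python
import numpy as np

def dpcm_encode(x, step=4):
    """DPCM: predict from the previous decoded sample, quantize the difference."""
    prev, codes = 0, []
    for sample in x.astype(np.int32):
        d = sample - prev                 # 1-2: prediction and differencing
        q = int(np.round(d / step))       # 3:   quantization (the lossy step)
        codes.append(q)                   # 4:   these small integers are what gets encoded
        prev = prev + q * step            #      predictor tracks the *decoded* value
    return codes

def dpcm_decode(codes, step=4):
    prev, out = 0, []
    for q in codes:
        prev = prev + q * step            # 5:   x(i) = x_hat(i) + quantized difference
        out.append(prev)
    return np.array(out)

x = np.array([100, 102, 101, 103, 104, 104, 105])
codes = dpcm_encode(x)
print(codes)                  # mostly small values
print(dpcm_decode(codes))     # close to x, within the quantizer step
```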
🔹 Why Use DPCM?
Benefit | Explanation
Compression | Neighboring values are similar, so the difference is small and compressible.
Efficiency | Smaller values → fewer bits → efficient storage or transmission.
Simple prediction | Even just using the previous sample as the prediction gives decent results.
1. Differential Coding in Image Standards
📷 JPEG
Lossless JPEG uses differential coding to predict pixel values and encode the difference.
DCT-based JPEG (lossy) uses DPCM for DC coefficients of 8×8 image blocks.
o DC coefficient = average brightness of the block
o DC values change slowly across blocks → encode difference between blocks efficiently
📹 Video Standards (H.261, H.263, MPEG)
Motion Compensated (MC) Coding is a form of predictive coding in the time domain.
o Predict the current frame from previous frames.
o Encode only the difference (residual) between the actual and predicted frame.
🔹 2. Purpose of Differential Coding
"To eliminate interpixel redundancy by coding only new information."
In images, adjacent pixels are often similar.
Instead of sending actual pixel values, send the difference from a prediction.
Differences are smaller and more compressible (especially with Huffman coding).
🔹 3. Types of DPCM
Type | Prediction Source
1-D DPCM | Previous pixels in the same row
2-D DPCM | Neighboring pixels in the same and previous rows
3-D DPCM | Neighboring pixels from previous frames (for video)
🔹 4. Error Propagation in DPCM
🔧 PCM (Pulse Code Modulation):
Each pixel is independently coded
Bit error affects only one pixel
🔧 DPCM:
Pixel prediction depends on previous decoded values
If a bit flips due to channel noise, the error propagates to future pixels
⚠️ Impact:
More severe in 1-D DPCM than in 2-D or 3-D, because 1-D relies heavily on a single previous pixel
A lower bit error rate (BER) is required for DPCM than for PCM
Fourier series
Any function that periodically repeats itself can be expressed as a sum of sines and cosines of different frequencies, each multiplied by a different coefficient. We get closer and closer to the original function as we add more and more frequencies.
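For reference, the standard form of the Fourier series of a function f(t) with period T is:

```latex
f(t) = \frac{a_0}{2}
     + \sum_{n=1}^{\infty}\left[ a_n \cos\!\left(\frac{2\pi n t}{T}\right)
                               + b_n \sin\!\left(\frac{2\pi n t}{T}\right)\right],
\qquad
a_n = \frac{2}{T}\int_{0}^{T} f(t)\cos\!\left(\frac{2\pi n t}{T}\right)dt,
\qquad
b_n = \frac{2}{T}\int_{0}^{T} f(t)\sin\!\left(\frac{2\pi n t}{T}\right)dt
```

Truncating the sum after a finite number of terms gives the successive approximations mentioned above.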
Why bother going into frequency domain?
Frequency domain representation makes it easy to visualize some characteristics of images
It is easy to conceptualize filters in frequency domain
Once a filter is selected in the frequency domain, it is usually implemented in the spatial domain
Frequency domain steps:
1. Transformation from the spatial domain to the frequency domain
2. Image processing in the frequency domain
3. Inverse transformation back to the spatial domain
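A minimal sketch of these three steps with NumPy's FFT (the filter here is a placeholder all-pass H; real lowpass/highpass transfer functions appear later in these notes):

```python
import numpy as np

# Hypothetical grayscale image.
img = np.random.rand(64, 64)

# 1. Spatial domain -> frequency domain (DC term shifted to the centre of the spectrum).
F = np.fft.fftshift(np.fft.fft2(img))

# 2. Processing in the frequency domain: multiply by a transfer function H(u, v).
H = np.ones_like(F)            # placeholder all-pass filter
G = H * F

# 3. Inverse transform back to the spatial domain; keep the real part.
result = np.real(np.fft.ifft2(np.fft.ifftshift(G)))
print(np.allclose(result, img))   # True, since the all-pass filter changes nothing
```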
What do frequencies mean in an image?
The frequency domain representation gives us a measure of how pixel values are distributed across an image.
Low frequencies correspond to slowly varying pixel values.
High frequencies indicate rapid variation in the pixel values.
A smooth wall painted with one color:
The color changes very slowly across the surface (if at all).
This is like low frequency — pixel values don’t change much.
A striped pattern with black and white lines:
The colors change very quickly between black and white.
This is like high frequency — pixel values vary rapidly.
The 2-D discrete Fourier transform of an M×N image is $F(u,v) = \frac{1}{MN}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)\, e^{-j2\pi(ux/M + vy/N)}$.
When u=v=0, this corresponds to average value
Moving away from this point, the low frequencies correspond to the slowly varying components of an image.
The higher frequencies correspond to faster gray level changes
Such relationships (although coarse) can help in establishing enhancement techniques in the frequency domain.
If the interval lengths of f(x) and h(x) are M and N respectively, the interval length of the convolution f(x)*h(x) will be M + N − 1.
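A one-line check of this length rule with NumPy's full linear convolution:

```python
import numpy as np

f = np.array([1, 2, 3, 4])     # interval length M = 4
h = np.array([1, 1, 1])        # interval length N = 3

# Full linear convolution has length M + N - 1 = 6.
print(len(np.convolve(f, h)))  # 6
```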
Comparison: Butterworth vs. Ideal Lowpass Filter
Feature | 🔴 Ideal Lowpass Filter | 🔵 Butterworth Lowpass Filter
Transition | Abrupt (sharp cutoff at cutoff frequency D0) | Smooth and gradual transition (controlled by order n)
Mathematical expression | Binary: 1 (pass) or 0 (block) | Continuous values between 0 and 1
Ringing | Causes ringing artifacts in the image | Much less ringing, or none
Real-world suitability | Less realistic, introduces distortions | More realistic, natural blur
Order control | No control over sharpness | Smoothness controlled by filter order n
Implementation | Easy but crude | Slightly more complex, but better results
Gaussian Low Pass Filter – Simple Explanation
A Gaussian Low Pass Filter (GLPF) is a type of filter used in frequency domain image processing to
blur an image or remove high-frequency noise, similar to Butterworth and Ideal lowpass filters — but
with even smoother and more natural results.
Key Characteristics:
Feature | Gaussian Low Pass Filter
Smoothness | Extremely smooth transition
No ringing artifacts | ✅ Eliminates ringing completely
Natural blur | ✅ Very realistic and soft
Mathematical simplicity | Simple and fast to compute
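A sketch of the three lowpass transfer functions discussed here (cutoff D0 and order n are arbitrary illustrative values); any of them can be plugged in as H in the FFT pipeline shown earlier:

```python
import numpy as np

def distance_grid(shape):
    """D(u, v): distance of each frequency sample from the centre of the shifted spectrum."""
    rows, cols = shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    V, U = np.meshgrid(v, u)
    return np.sqrt(U ** 2 + V ** 2)

shape, d0, n = (128, 128), 30.0, 2
D = distance_grid(shape)

H_ideal       = (D <= d0).astype(float)             # hard cutoff -> ringing
H_butterworth = 1.0 / (1.0 + (D / d0) ** (2 * n))   # smoothness controlled by order n
H_gaussian    = np.exp(-(D ** 2) / (2 * d0 ** 2))   # smoothest transition, no ringing

print(H_ideal.shape, round(H_butterworth.max(), 3), round(float(H_gaussian.min()), 6))
```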
HIGH PASS FILTER
Ringing artifacts can also occur with high pass filters, depending on the type of high pass filter used.
Why Butterworth is preferred:
Less ringing than Ideal filter
Controllable sharpness using order n
Avoids unnatural edge effects and artifacts
Better suited for real-world images
Final Recommendation:
Filter | Use If You Want
Ideal High Pass | Fast, simple edge detection (but with artifacts)
Butterworth High Pass | Controllable edge sharpening with acceptable quality
Gaussian High Pass | Best quality: smoothest, most natural enhancement, no artifacts
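A short sketch of the corresponding highpass transfer functions: each one is simply the complement of its lowpass counterpart, so the formulas above carry over directly (D0 and n are again illustrative):

```python
import numpy as np

D, d0, n = np.linspace(0, 100, 101), 30.0, 2   # radial frequency samples

H_lp_gaussian = np.exp(-(D ** 2) / (2 * d0 ** 2))
H_hp_gaussian = 1.0 - H_lp_gaussian                                       # Gaussian highpass: smooth, no ringing

H_hp_butterworth = 1.0 / (1.0 + (d0 / np.maximum(D, 1e-9)) ** (2 * n))   # Butterworth highpass
H_hp_ideal = (D > d0).astype(float)                                       # ideal highpass: hard cutoff -> ringing

print(round(H_hp_gaussian[0], 3), round(H_hp_gaussian[-1], 3))   # ~0 at the centre, ~1 far away
```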