
Chapter 2

Signal Analysis
Selected Topics in Artificial Intelligence
Summer Semester 2025
Content
• Raw Data and PCM
• Time Framing
• Fourier Transform
• What is a Spectrogram?
• Feature Extraction – MFCC
1. Raw Data and PCM

• How is audio digitized?

Before analyzing a signal in the frequency domain, it must first be converted from analog to digital form. This process involves sampling and quantization, typically implemented using Pulse Code Modulation (PCM).
1.2 What is Pulse Code Modulation (PCM)?

• Pulse Code Modulation (PCM) is the standard method for converting analog audio signals into digital form.
It involves three steps:
1. Sampling the signal at regular time intervals,
2. quantizing each sample to the nearest discrete level,
3. and encoding the result into a binary stream (e.g., 010110...).
The final output is a digital signal that can be stored, transmitted, or analyzed by computers.
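A minimal NumPy sketch of these three steps, assuming for illustration a 1 kHz sine wave, a 16 kHz sampling rate, and 8-bit quantization (none of these values come from the slides):

import numpy as np

# Assumed parameters for illustration only
fs = 16_000          # sampling rate in Hz
duration = 0.01      # length of the signal in seconds
bits = 8             # quantization depth

# 1. Sampling: evaluate the analog signal at regular time intervals
t = np.arange(0, duration, 1 / fs)
analog = np.sin(2 * np.pi * 1000 * t)    # 1 kHz sine wave in [-1, 1]

# 2. Quantization: map each sample to the nearest of 2**bits discrete levels
levels = 2 ** bits
quantized = np.round((analog + 1) / 2 * (levels - 1)).astype(np.uint8)

# 3. Encoding: represent each quantized sample as a binary code word
encoded = [format(q, f"0{bits}b") for q in quantized]
print(encoded[:4])   # first few 8-bit code words, e.g. '10000000', '10110000', ...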
2. Time Framing
In audio processing, time framing involves segmenting a
continuous audio signal into smaller, overlapping or non-
overlapping blocks called frames. These frames are then
analyzed individually, allowing for processing of time-
varying characteristics of the audio. A typical frame
duration is 20-40 milliseconds, with a frame hop (the
amount of time advanced between frames) often around
10 milliseconds.
2.1 Steps in Time Framing and Preprocessing for Spectral Analysis

• Signal division: The audio signal, which is a continuous waveform, is divided into frames.
• Frame length and hop: Each frame has a specific length (e.g., 25 milliseconds) and a frame hop (e.g., 10 milliseconds). The hop determines how far the frame advances from one to the next. Overlapping frames are common to capture transitions more smoothly.
• Windowing: A window function is often applied to each frame to reduce discontinuities at the frame edges and minimize spectral leakage when performing Fourier analysis.
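A minimal NumPy sketch of framing and Hamming windowing, assuming a mono signal, a 16 kHz sampling rate, a 25 ms frame, and a 10 ms hop (parameters chosen for illustration):

import numpy as np

def frame_signal(x, fs=16_000, frame_ms=25, hop_ms=10):
    """Split a 1-D signal into overlapping, Hamming-windowed frames."""
    frame_len = int(fs * frame_ms / 1000)   # samples per frame
    hop = int(fs * hop_ms / 1000)           # samples advanced between frames
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    window = np.hamming(frame_len)          # tapers frame edges, reduces spectral leakage
    return np.stack([x[i * hop : i * hop + frame_len] * window
                     for i in range(n_frames)])

# Example: 1 second of noise -> 98 frames of 400 samples each
x = np.random.randn(16_000)
print(frame_signal(x).shape)   # (98, 400)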
Tutorial for frame blocking

• A signal is sampled at 12 kHz, the frame size is chosen to be 20 ms, and adjacent frames are separated by 5 ms.
Calculate N (frame length in samples) and m (frame hop in samples).
(Answer: N = 12,000 × 0.020 = 240; m = 12,000 × 0.005 = 60.)
• Repeat the above when adjacent frames do not overlap.
(Answer: N = 240, m = 240.)
Frame Length (in samples)

• N = Sampling Rate × Frame Duration (in seconds)
• N: Number of samples in one frame
• Sampling Rate: Samples per second (e.g., 22,000 Hz)
• Frame Duration: Frame size in seconds (e.g., 15 ms = 0.015 s)
Frame Overlap (in samples)

• Overlap = Overlap Ratio × N
• Overlap Ratio: Fraction of the frame that overlaps with the next (e.g., 0.40 for 40%)
• N: Frame length in samples
Frame Hop (Step Size)

• m = N − Overlap
• m: Step size, i.e., how far we move forward to start the next frame
• N: Frame length in samples
• Overlap: Number of overlapping samples between frames
Class exercise
For a 22 kHz / 16-bit speech wave, the frame size is 15 ms and adjacent frames overlap by 40% of the frame size.
Calculate N and m.
Answer: Number of samples in one frame: N = 0.015 s × 22,000 Hz = 330.
Overlapping samples = 0.40 × 330 = 132, so m = N − 132 = 198.
Overlapping time = 132 × (1/22,000) s = 6 ms;
Time in one frame = 330 × (1/22,000) s = 15 ms.
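A short Python sketch that reproduces the formulas above and the numbers in this exercise:

def frame_params(fs_hz, frame_ms, overlap_ratio):
    """Return frame length N, overlap, and hop m, all in samples."""
    n = int(fs_hz * frame_ms / 1000)     # N = sampling rate x frame duration
    overlap = int(overlap_ratio * n)     # Overlap = overlap ratio x N
    m = n - overlap                      # m = N - Overlap
    return n, overlap, m

# Class exercise: 22 kHz sampling, 15 ms frames, 40% overlap
print(frame_params(22_000, 15, 0.40))   # (330, 132, 198)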
3. Fourier Transform
The Fourier Transform is a mathematical tool that decomposes a function (often a signal) into its constituent frequencies. It takes information from the "time domain" or "spatial domain" and transforms it into the "frequency domain". This transformation is widely used in fields such as signal processing, image analysis, and physics.
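A minimal sketch of this decomposition using NumPy's FFT on one windowed frame; the two-tone test signal and the 16 kHz sampling rate are assumptions for illustration:

import numpy as np

fs = 16_000                          # sampling rate (illustrative)
t = np.arange(0, 0.025, 1 / fs)      # one 25 ms frame
frame = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 2000 * t)

# FFT of the Hamming-windowed frame -> complex spectrum
spectrum = np.fft.rfft(frame * np.hamming(len(frame)))
freqs = np.fft.rfftfreq(len(frame), d=1 / fs)   # frequency axis in Hz
magnitude = np.abs(spectrum)

# The two largest peaks sit at the 440 Hz and 2000 Hz components
print(freqs[np.argsort(magnitude)[-2:]])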
Applications of the Fourier Transform:

• Signal Processing: In audio processing, it helps identify the frequencies present in a sound, allowing for filtering or manipulation of specific frequencies. In radio and telecommunications, it is used to analyze and process signals.

• Image Analysis: In image processing, it can reveal the spatial frequencies or patterns within an image, helping with tasks like edge detection or noise reduction.

• Physics
What is a Spectrogram?
Definition:
• A spectrogram is a visual representation of how
the frequency content of a signal changes over time.
It is a powerful tool in audio signal processing, particularly
for analyzing speech and music.
Spectrogram Axes:
• X-axis (Horizontal): Time
• Y-axis (Vertical): Frequency
• Color/Intensity: Represents the power
(amplitude) or energy of the signal at a given time and
frequency
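A minimal sketch of computing and plotting a spectrogram with SciPy and Matplotlib; the chirp test signal and frame settings are illustrative assumptions:

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import chirp, spectrogram

fs = 16_000
t = np.arange(0, 2.0, 1 / fs)
x = chirp(t, f0=200, f1=4000, t1=2.0)        # tone sweeping 200 Hz -> 4 kHz

# STFT-based spectrogram: 25 ms frames with a 10 ms hop (240-sample overlap)
f, tt, Sxx = spectrogram(x, fs=fs, nperseg=400, noverlap=240)

plt.pcolormesh(tt, f, 10 * np.log10(Sxx + 1e-12))   # color/intensity = power in dB
plt.xlabel("Time [s]")         # x-axis: time
plt.ylabel("Frequency [Hz]")   # y-axis: frequency
plt.show()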
Spectral Envelope

• The spectral envelope is the general shape of the frequency spectrum at a given time.
• By observing how this shape evolves, we can understand how the characteristics of the sound (e.g., pitch, formants) change over time.
Applications:

Spectrograms are widely used in various fields, including:

1. Audio analysis: Identifying different sounds (speech, music, etc.) and their characteristics.
2. Speech processing: Analyzing vocal sounds, understanding speech patterns, and developing speech recognition systems.
3. Machine learning: Extracting features from audio signals for tasks like audio classification and speaker recognition.
4. Feature Extraction – MFCC

MFCC stands for Mel-frequency Cepstral Coefficients. It is a feature used in automatic speech and speaker recognition.
MFCCs are a compact mathematical representation of the short-term spectral envelope shaped by the human vocal tract during speech. The process involves several steps to capture the essential characteristics of human speech, emphasizing the frequency ranges most discernible to the human ear.
How to compute MFCC?
To calculate MFCCs, we follow these steps:
• Pre-emphasize the signal: Amplify higher frequencies to balance the
spectrum.
• Framing: Break the signal into small, overlapping frames.
• Windowing: To soften the edges of each frame, apply a Hamming
window.
• FFT: Convert each frame from the time domain to the frequency
domain.
• Mel-filter bank: Apply overlapping triangular filters spaced according
to the Mel scale.
• Logarithm: To replicate the way a human ear reacts to sound strength,
take the logarithm of the filter bank outputs.
• DCT: Apply the DCT to the log Mel-spectrum to obtain the Mel-
frequency Cepstral Coefficients.
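A minimal NumPy/SciPy sketch that follows these steps end to end; the parameter choices (0.97 pre-emphasis, 25 ms frames with a 10 ms hop, 26 Mel filters, 13 coefficients, 512-point FFT) are common conventions assumed for illustration, not values given in the slides. A library such as librosa provides a production implementation.

import numpy as np
from scipy.fft import dct

def mfcc(signal, fs=16_000, frame_ms=25, hop_ms=10,
         n_filters=26, n_coeffs=13, n_fft=512):
    # 1. Pre-emphasis: boost high frequencies to balance the spectrum
    x = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])

    # 2-3. Framing and Hamming windowing
    frame_len = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    frames = np.stack([x[i * hop : i * hop + frame_len] * np.hamming(frame_len)
                       for i in range(n_frames)])

    # 4. FFT -> power spectrum of each frame
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # 5. Mel filter bank: triangular filters equally spaced on the Mel scale
    def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
    def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(fs / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, left:center] = (np.arange(left, center) - left) / max(center - left, 1)
        fbank[i - 1, center:right] = (right - np.arange(center, right)) / max(right - center, 1)

    # 6. Logarithm of the filter bank energies (mimics loudness perception)
    log_energies = np.log(power @ fbank.T + 1e-10)

    # 7. DCT -> keep the first n_coeffs cepstral coefficients
    return dct(log_energies, type=2, axis=1, norm='ortho')[:, :n_coeffs]

# Example: MFCCs of one second of noise -> (98, 13) matrix
print(mfcc(np.random.randn(16_000)).shape)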
