Lecture 2

Uploaded by

Rakshith Kamath

ELEN 6820

Speech and audio signal processing

Instructor: Nima Mesgarani (nm2764)
3 credits
TA: Yi Luo (yl3364)
Office hours: TBD
Course overview
• Brief history of speech recognition
• Discrete Signal Processing (DSP) overview
• Pattern recognition and deep learning overview
• Speech signal production
• Speech signal representation
• Auditory Scene Analysis, speech enhancement and separation
• Speech processing in the auditory system
• Acoustic modeling
• Sequence recognition and Hidden Markov Models
• Language models
• Music signal processing
Homeworks
• HW1: Discrete signal processing (written) (W2)
• HW2: Neural networks and voice activity detection (programming) (W3&4)
• HW3: Speech signal production and representation (written) (W5)
• HW3: Speech enhancement and separation (programming) (W6)
• HW4: Acoustic event detection and speaker identification (programming) (W7&8)
• HW5: Phoneme recognition and automatic speech recognition (programming) (W9&10)
• Final project (programming) (W11-13)
Week  Topic                               HW
1     Introduction and history            -
2     Discrete signal processing          DSP (W)
3     Machine learning 1                  Neural network and VAD (P)
4     Machine learning 2                  -
5     Speech signal production            Speech production (W)
6     Speech signal representation        Speech enhancement (P)
7     Speech enhancement and separation   -
8     Human speech perception             Acoustic event detection (P)
9     Acoustic modeling                   -
10    Sequence modeling and HMMs          Phoneme recognition and ASR (P)
11    Language modeling                   -
12    Automatic speech recognition        Project
13    Music signal processing             -

(W) = written homework, (P) = programming homework
Course evaluation

• Two written homeworks (20%)
• Four programming homeworks (60%)
• Final project (20%)
• Late submission: 10% penalty per day
Final project

• Preferably choose the predefined course project
• Alternatively, define a project that is similar in scope and workload, in discussion with me and Yi
How to install Python with a graphical interface on Mac/Windows/Linux
Install Jupyter Notebook using Anaconda and conda
Download it from the following address and follow the instructions:
www.anaconda.com/products/individual

• Anaconda will simultaneously install Python and Jupyter Notebook as well as some necessary packages (e.g. numpy, scipy, etc.)
• You can use either the graphical installer or the command line installer on Mac OS
• If you have Windows 10 and want to use bash commands, it is highly recommended that you enable the Linux subsystem bash environment and install a Linux version of Anaconda on it (using the command line installer)
• After installing Anaconda, you can either run Jupyter Notebook from the Anaconda app or run the command jupyter notebook in a terminal
• You can also use other environments for interacting with Python, but the one recommended for this course is Jupyter Notebook, especially if you want to run your code on a server (e.g. for TensorFlow)
• More information is available at: http://jupyter.readthedocs.io/en/latest/index.html
• More instructions and tutorials for starting Python will be taught next session
• The assignments will be checked in Jupyter Notebook using Python 3
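Once the installation finishes, a quick sanity check can be run in a notebook cell. This is only a sketch: the helper name check_environment and the list of packages it probes are illustrative choices, not part of the course material.

```python
import importlib.util
import sys


def check_environment(packages=("numpy", "scipy", "notebook")):
    """Map each package name to True if Python can locate it (hypothetical helper)."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


print("Python version:", sys.version.split()[0])
for name, found in check_environment().items():
    print(f"{name:10s}", "found" if found else "MISSING")
```

If a package is reported as missing, installing it from the Anaconda environment is usually the simplest fix.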
Signal processing
background (chapter 2)

Speech communication

[Figure: the speech communication chain from production to perception, with the ear drum labeled]

Cocktail party problem (Cherry, 1953)
Discrete Signal Processing
• Discrete-time signals and systems
• Discrete-time Fourier transform, z-transform
• Digital filters, IIR and FIR
• Sampling theorem, changing the sampling rate
• Emphasis on intuition
Discrete-Time Signals and Systems
• Speech signal: represents a continuously varying pattern as a function of a continuous variable t, which represents time.
• Discrete signal: x[n] = x_a(nT), where T = 1/Fs is the sampling period
• Telephone bandwidth speech: Fs = 6.4 kHz
• Wide-band speech: Fs = 16 kHz
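The relation x[n] = x_a(nT) can be sketched in a few lines of plain Python. The 1 kHz cosine used as the continuous-time signal and the function names are assumptions made for illustration; only the 16 kHz rate comes from the slide.

```python
import math


def sample(x_a, fs, num_samples):
    """Return x[n] = x_a(n*T) for n = 0..num_samples-1, with T = 1/fs."""
    T = 1.0 / fs
    return [x_a(n * T) for n in range(num_samples)]


def tone(t, f0=1000.0):
    """Hypothetical continuous-time signal: a cosine at f0 Hz."""
    return math.cos(2 * math.pi * f0 * t)


fs = 16000  # wide-band speech rate from the slide
x = sample(tone, fs, 32)
print("sampling period T =", 1.0 / fs, "seconds")
print("samples per period of the tone:", fs / 1000.0)
```

At this rate one period of the 1 kHz tone spans 16 samples, so x[16] returns to the value of x[0].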

A few basics

• Unit impulse function, unit step function, exponential sequence

• Convolution
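A minimal sketch of these building blocks, written in plain Python so it has no dependencies. The function names and the decay factor a = 0.5 are illustrative assumptions; in practice numpy would be the usual tool.

```python
def unit_impulse(n):
    """delta[n]: 1 at n = 0, 0 elsewhere."""
    return 1.0 if n == 0 else 0.0


def unit_step(n):
    """u[n]: 1 for n >= 0, 0 otherwise."""
    return 1.0 if n >= 0 else 0.0


def exponential(n, a=0.5):
    """a**n * u[n] (a = 0.5 is an arbitrary example value)."""
    return a ** n if n >= 0 else 0.0


def convolve(x, h):
    """Linear convolution y[n] = sum_k x[k] h[n-k] of finite sequences starting at n = 0."""
    y = [0.0] * (len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        for m, hm in enumerate(h):
            y[k + m] += xk * hm
    return y


delta = [unit_impulse(n) for n in range(4)]
step = [unit_step(n) for n in range(4)]
expo = [exponential(n) for n in range(4)]

# Convolving with the unit impulse leaves a sequence unchanged.
print(convolve(expo, delta)[:4])
# Convolving the step with a first difference recovers an impulse at n = 0;
# the trailing -1 appears because the step was truncated to 4 samples.
print(convolve(step, [1.0, -1.0]))
```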
Transformations of Signals and Systems

• Fourier Transform

• z-Transform
The Continuous-Time Fourier Transform

• What did Fourier show?

• What's the big deal? (1822)

• Decomposing signals into fast and slow components

• Importance of sine functions for linear systems
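One way to see why sinusoids matter for linear systems: a complex exponential passed through a linear shift-invariant system comes out as the same exponential, scaled by a single complex number. The sketch below checks this in discrete time, which is an assumption of convenience since the slide is about the continuous-time case; the filter coefficients and frequency are arbitrary example values.

```python
import cmath


def fir_filter(h, x):
    """y[n] = sum_k h[k] x[n-k], treating x[n] as zero for n < 0."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def frequency_response(h, omega):
    """H(e^{jw}) = sum_k h[k] e^{-jwk}."""
    return sum(hk * cmath.exp(-1j * omega * k) for k, hk in enumerate(h))


h = [0.25, 0.5, 0.25]   # hypothetical smoothing filter
omega = 0.3             # radians per sample, arbitrary
x = [cmath.exp(1j * omega * n) for n in range(50)]
y = fir_filter(h, x)
H = frequency_response(h, omega)

# Once the start-up transient has passed (n >= len(h) - 1), y[n] equals H * x[n].
max_err = max(abs(y[n] - H * x[n]) for n in range(len(h) - 1, len(x)))
print("largest deviation from H * x[n]:", max_err)
```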

The z-Transform
• A powerful tool for analyzing linear systems described by difference equations, the discrete-time counterpart of differential equations

• Definition

• Inverse z-Transform

• Examples: delayed unit response, box pulse, exponential

• Properties of the z-Transform: linearity, shift, exponential weighting, linear weighting, convolution, multiplication of sequences
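As a small numerical illustration of the exponential example and the shift property, the sketch below assumes the usual definition X(z) = sum over n of x[n] z^(-n), which the slide names but does not spell out. The values of a and z are arbitrary, chosen so that |z| > |a|.

```python
def z_transform(x, z):
    """Evaluate X(z) = sum_n x[n] z**(-n) for a finite sequence starting at n = 0."""
    return sum(xn * z ** (-n) for n, xn in enumerate(x))


a = 0.5
z = 1.2 + 0.4j                      # a point with |z| > |a|, inside the region of convergence
x = [a ** n for n in range(200)]    # truncated version of a**n u[n]

# Exponential example: a**n u[n] has X(z) = 1 / (1 - a z**-1) for |z| > |a|.
closed_form = 1.0 / (1.0 - a / z)
exp_err = abs(z_transform(x, z) - closed_form)
print("exponential example error:", exp_err)

# Shift property: delaying by one sample multiplies X(z) by z**-1.
x_delayed = [0.0] + x
shift_err = abs(z_transform(x_delayed, z) - z_transform(x, z) / z)
print("shift property error:", shift_err)
```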
The Discrete-Time Fourier Transform

• Definition, periodic
• Inverse DTFT
• DTFT of a Cosine Signal

$$X(e^{j\omega}) = \sum_{n=-\infty}^{+\infty} x[n]\, e^{-j\omega n}$$

$$x[n] = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(e^{j\omega})\, e^{j\omega n}\, d\omega$$

• Sufficient condition for convergence: $\sum_{n=-\infty}^{+\infty} |x[n]| < +\infty$
• Although x[n] is discrete, $X(e^{j\omega})$ is continuous and periodic with period 2π.
• Convolution/multiplication duality:

$$y[n] = x[n] * h[n] \quad\Longleftrightarrow\quad Y(e^{j\omega}) = X(e^{j\omega})\, H(e^{j\omega})$$

$$y[n] = x[n]\, w[n] \quad\Longleftrightarrow\quad Y(e^{j\omega}) = \frac{1}{2\pi} \int_{-\pi}^{\pi} W(e^{j\theta})\, X(e^{j(\omega-\theta)})\, d\theta$$
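The convolution side of the duality, and the 2π periodicity, can be checked numerically for short sequences. This is a sketch with arbitrary example sequences and test frequencies; the DTFT is evaluated directly from its defining sum.

```python
import cmath


def dtft(x, omega):
    """X(e^{jw}) = sum_n x[n] e^{-jwn} for a finite sequence x[0..N-1]."""
    return sum(xn * cmath.exp(-1j * omega * n) for n, xn in enumerate(x))


def convolve(x, h):
    """Linear convolution of two finite sequences starting at n = 0."""
    y = [0.0] * (len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        for m, hm in enumerate(h):
            y[k + m] += xk * hm
    return y


x = [1.0, 2.0, 0.0, -1.0]   # arbitrary example input
h = [0.5, 0.5]              # arbitrary example impulse response
y = convolve(x, h)

# Convolution in time corresponds to multiplication of the DTFTs.
omegas = [0.0, 0.7, 2.0, -1.3]
errors = [abs(dtft(y, w) - dtft(x, w) * dtft(h, w)) for w in omegas]
print("largest convolution-theorem error:", max(errors))

# The DTFT is periodic in omega with period 2*pi.
period_err = abs(dtft(x, 0.7) - dtft(x, 0.7 + 2 * cmath.pi))
print("periodicity error:", period_err)
```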
The Discrete Fourier Transform

• Sampling the DTFT: Discrete Fourier Transform (DFT)

Practical implications

• Periodic signals, or finite-length sequences

• What frequency does each DFT coefficient correspond to?

• Circular shift of x[n]

• Boundary conditions, importance of windowing
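Two of these points can be made concrete with a direct (slow) DFT: bin k of an N-point DFT corresponds to k·Fs/N Hz for k up to N/2, and a circular shift of x[n] by m samples multiplies X[k] by e^{-j2πkm/N}. The signal, N, and shift below are arbitrary example values, and the helper names are illustrative.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) DFT: X[k] = sum_n x[n] e^{-j 2 pi k n / N}."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]


def bin_frequency(k, N, fs):
    """Frequency in Hz of DFT bin k, valid for 0 <= k <= N/2."""
    return k * fs / N


fs, N = 16000, 64
x = [math.cos(2 * math.pi * 1000 * n / fs) for n in range(N)]  # 1 kHz tone
X = dft(x)
peak = max(range(N // 2 + 1), key=lambda k: abs(X[k]))
print("peak bin:", peak, "->", bin_frequency(peak, N, fs), "Hz")

# Circular shift by `shift` samples changes only the phase of each bin.
shift = 5
x_shifted = [x[(n - shift) % N] for n in range(N)]
X_shifted = dft(x_shifted)
shift_err = max(abs(X_shifted[k] - X[k] * cmath.exp(-2j * math.pi * k * shift / N))
                for k in range(N))
print("circular shift property error:", shift_err)
```

The tone was chosen to complete a whole number of periods in N samples, so its energy falls in a single bin; a tone that does not fit the frame exactly would leak into neighboring bins, which is where windowing comes in.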

Time-Dependent Fourier Transform

Create a finite-length sequence by windowing:

[Figure: a signal x[m] with the sliding window w[n − m] shown at n = 50, n = 100, and n = 200]

$$X_n(e^{j\omega}) = \sum_{m=-\infty}^{+\infty} w[n-m]\, x[m]\, e^{-j\omega m}$$

• If n is fixed, then it can be shown that:

$$X_n(e^{j\omega}) = \frac{1}{2\pi} \int_{-\pi}^{\pi} W(e^{j\theta})\, e^{j\theta n}\, X(e^{j(\omega+\theta)})\, d\theta$$

• The above equation is meaningful only if we assume that $X(e^{j\omega})$ represents the Fourier transform of a signal whose properties continue outside the window, or simply that the signal is zero outside the window.
• In order for $X_n(e^{j\omega})$ to correspond to $X(e^{j\omega})$, $W(e^{j\omega})$ must resemble an impulse
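The defining sum for X_n(e^{jω}) can be evaluated directly for a finite window. The sketch below assumes a rectangular window of 40 samples and a cosine test signal, both chosen for illustration, and evaluates the transform at the three window positions shown in the figure. This quantity is often called the short-time Fourier transform.

```python
import cmath
import math


def time_dependent_ft(x, w, n, omega):
    """X_n(e^{jw}) = sum_m w[n-m] x[m] e^{-jwm}, with w supported on 0..len(w)-1."""
    total = 0j
    for m in range(len(x)):
        k = n - m                      # index into the window
        if 0 <= k < len(w):
            total += w[k] * x[m] * cmath.exp(-1j * omega * m)
    return total


N = 40
w = [1.0] * N                          # rectangular window (example choice)
omega0 = 2 * math.pi * 0.1             # test tone with a period of 10 samples
x = [math.cos(omega0 * m) for m in range(300)]

# At the tone frequency the magnitude is the same for every window position.
on_tone = {n: abs(time_dependent_ft(x, w, n, omega0)) for n in (50, 100, 200)}
# Away from the tone (here at 0.3 cycles per sample) the response is close to zero.
off_tone = abs(time_dependent_ft(x, w, 100, 2 * math.pi * 0.3))
print(on_tone, off_tone)
```

Because the signal is stationary, only the phase of X_n changes with n; for speech, the magnitude would change from one window position to the next.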
Rectangular Window

$$w[n] = 1, \quad 0 \le n \le N-1$$

6.345 Automatic Speech Recognition (2003), Speech Signal Representation

Hamming Window

$$w[n] = 0.54 - 0.46 \cos\left(\frac{2\pi n}{N-1}\right), \quad 0 \le n \le N-1$$

Comparison of Windows
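A rough numerical comparison of the two windows defined above: the sketch evaluates each window's DTFT magnitude on a grid and reports the highest sidelobe relative to the main-lobe peak. The window length N = 51, the grid density, and the helper names are assumptions for illustration, and the reported levels are approximate because of the finite grid.

```python
import cmath
import math


def rectangular(N):
    """w[n] = 1 for 0 <= n <= N-1."""
    return [1.0] * N


def hamming(N):
    """w[n] = 0.54 - 0.46 cos(2 pi n / (N-1)) for 0 <= n <= N-1."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]


def peak_sidelobe_db(w, grid=2000):
    """Highest sidelobe of |W(e^{jw})| relative to the value at w = 0, in dB."""
    mags = []
    for i in range(grid + 1):
        omega = math.pi * i / grid
        mags.append(abs(sum(wn * cmath.exp(-1j * omega * n) for n, wn in enumerate(w))))
    i = 0
    while i + 1 < len(mags) and mags[i + 1] < mags[i]:
        i += 1                          # walk down the main lobe to its first minimum
    return 20 * math.log10(max(mags[i:]) / mags[0])


N = 51
rect_db = peak_sidelobe_db(rectangular(N))
hamm_db = peak_sidelobe_db(hamming(N))
print("rectangular peak sidelobe (dB):", round(rect_db, 1))
print("Hamming peak sidelobe (dB):", round(hamm_db, 1))
```

The Hamming window pays for its much lower sidelobes with a main lobe roughly twice as wide as the rectangular window's.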


Spectrogram

• Use a sliding window over the signal, and display the magnitude of the DFT for each step.

• Large vs. small window?

• Overlapping vs. non-overlapping?
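The sliding-window procedure can be sketched as follows. The frame length of 64 samples, the 50% overlap, the Hamming window, and the two-tone test signal are all example choices rather than values from the lecture; a real implementation would use an FFT instead of the direct DFT shown here.

```python
import cmath
import math


def hamming(N):
    """Hamming window of length N."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]


def dft_magnitude(frame):
    """Magnitudes of DFT bins 0..N/2 of a real-valued frame."""
    N = len(frame)
    return [abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)))
            for k in range(N // 2 + 1)]


def spectrogram(x, frame_len=64, hop=32):
    """One magnitude spectrum per window position; hop < frame_len means overlap."""
    w = hamming(frame_len)
    columns = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = [x[start + n] * w[n] for n in range(frame_len)]
        columns.append(dft_magnitude(frame))
    return columns


fs = 16000
# Test signal: 256 samples of a 1 kHz tone followed by 256 samples of a 3 kHz tone.
x = ([math.cos(2 * math.pi * 1000 * n / fs) for n in range(256)]
     + [math.cos(2 * math.pi * 3000 * n / fs) for n in range(256)])
spec = spectrogram(x)
peaks = [max(range(len(col)), key=col.__getitem__) for col in spec]
print("number of frames:", len(spec))
print("strongest bin per frame:", peaks)
```

Plotting spec as an image, with time on the horizontal axis and bin index on the vertical axis, gives the familiar spectrogram display.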

A Wideband Spectrogram

[Figure: wideband spectrogram of the utterance "Two plus seven is less than ten"]

A Narrowband Spectrogram

[Figure: narrowband spectrogram of the utterance "Two plus seven is less than ten"]

Tradeoff between DFT length (temporal resolution) and spectral resolution
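The tradeoff can be put in numbers: a window of N samples spans N/Fs seconds and resolves frequencies on the order of Fs/N Hz apart. The 5 ms and 40 ms durations below are assumed, illustrative values for wideband and narrowband analysis, not figures taken from the slides, and the exact resolution also depends on the window shape.

```python
def window_resolution(window_ms, fs):
    """Return (window length in samples, approximate frequency resolution in Hz)."""
    n = int(round(window_ms * 1e-3 * fs))
    return n, fs / n


fs = 16000
for label, ms in (("wideband", 5), ("narrowband", 40)):
    n, df = window_resolution(ms, fs)
    print(f"{label}: {ms} ms window = {n} samples, resolution about {df} Hz")
```

A short window follows rapid temporal changes but smears frequency detail; a long window separates nearby frequencies but blurs events in time.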

Digital filters

• A digital filter is a discrete-time shift-invariant system

• Convolution equation: unit response, transfer function, system function

• All useful systems satisfy the linear difference equation
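The slide does not write the difference equation out, so the sketch below assumes one common form, y[n] = Σ_r b[r] x[n−r] + Σ_k a[k] y[n−k], with zero initial conditions. Sign conventions for the feedback terms vary between textbooks and libraries, and the coefficients are arbitrary examples.

```python
def difference_equation(b, a, x):
    """y[n] = sum_r b[r] x[n-r] + sum_{k>=1} a[k-1] y[n-k], zero initial conditions.

    b holds the feed-forward coefficients; a holds the feedback coefficients
    for y[n-1], y[n-2], ... (an assumed convention, see the note above).
    """
    y = []
    for n in range(len(x)):
        acc = sum(b[r] * x[n - r] for r in range(len(b)) if n - r >= 0)
        acc += sum(a[k - 1] * y[n - k] for k in range(1, len(a) + 1) if n - k >= 0)
        y.append(acc)
    return y


impulse = [1.0] + [0.0] * 7

# No feedback terms: the unit response ends after len(b) samples (FIR).
fir = difference_equation([0.5, 0.5], [], impulse)
# One feedback term: the unit response 0.5**n never reaches exactly zero (IIR).
iir = difference_equation([1.0], [0.5], impulse)
print(fir)
print(iir)
```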

FIR vs. IIR filters

• Linear vs. nonlinear phase

• Large vs. small impulse response duration
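On the phase point: an FIR filter with a symmetric impulse response has exactly linear phase, equivalent to a pure delay of M/2 samples, which a causal, stable IIR filter cannot achieve exactly. The sketch below checks this for an arbitrary symmetric example filter by removing the delay term and confirming that what remains is real-valued.

```python
import cmath


def frequency_response(h, omega):
    """H(e^{jw}) = sum_k h[k] e^{-jwk}."""
    return sum(hk * cmath.exp(-1j * omega * k) for k, hk in enumerate(h))


h = [1.0, 2.0, 3.0, 2.0, 1.0]   # symmetric example: h[k] == h[M - k]
M = len(h) - 1

residuals = []
for omega in (0.1, 0.5, 1.0, 2.0, 3.0):
    H = frequency_response(h, omega)
    amplitude = H * cmath.exp(1j * omega * M / 2)   # undo the delay of M/2 samples
    residuals.append(abs(amplitude.imag))           # zero if the phase is exactly linear

print("largest imaginary residue:", max(residuals))
```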

Sampling
• Represent a continuous-time signal as a sequence of numbers

• The Sampling Theorem
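The sampling theorem says a signal can be recovered from its samples only if the sampling rate exceeds twice its highest frequency. The sketch below shows what goes wrong otherwise: sampled at 16 kHz, a 15 kHz cosine produces exactly the same samples as a 1 kHz cosine. The frequencies and function names are illustrative choices.

```python
import math


def sample_cosine(f0, fs, num_samples):
    """Samples of cos(2 pi f0 t) taken at t = n / fs."""
    return [math.cos(2 * math.pi * f0 * n / fs) for n in range(num_samples)]


def satisfies_sampling_theorem(f_max, fs):
    """True if fs is more than twice the highest frequency in the signal."""
    return fs > 2 * f_max


fs = 16000
low = sample_cosine(1000, fs, 32)
high = sample_cosine(fs - 1000, fs, 32)   # 15 kHz, above fs / 2

alias_err = max(abs(a - b) for a, b in zip(low, high))
print("1 kHz ok at 16 kHz:", satisfies_sampling_theorem(1000, fs))
print("15 kHz ok at 16 kHz:", satisfies_sampling_theorem(15000, fs))
print("largest difference between the two sample sequences:", alias_err)
```

This is why the continuous-time signal is low-pass filtered before sampling.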

Changing the sampling rate of a signal
