Acoustic Phonetics Overview

Acoustic Phonetics

Sanjukta Ghosh
Acoustic Phonetics
• Acoustic phonetics is the study of the physical properties of speech sounds, including their description and analysis.
• Sound, as a form of energy, can be described by three physical characteristics: frequency, amplitude and duration.
• Listeners perceive these physical characteristics as the pitch, loudness and length of a sound, respectively.
Periodic sounds
• Periodic sounds are those which repeat themselves at regular intervals.
• A periodic sound may be a simple sinusoid or a complex periodic sound consisting of many overlapping simple waves.
• Examples of simple sinusoidal behaviour are the swing of a pendulum and the tone of a tuning fork.
• Examples of complex periodic sounds are the vowels of speech and the sounds of musical instruments such as the sitar and the guitar.
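The difference between a simple and a complex periodic sound can be sketched in a few lines of Python. This is only an illustration: the sample rate, the 100 Hz fundamental and the harmonic amplitudes are assumed values, and the function names are hypothetical.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for illustration)


def sinusoid(freq_hz, n_samples, amplitude=1.0):
    """Return samples of a simple sinusoid at freq_hz."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(n_samples)]


def complex_periodic(f0_hz, n_samples, harmonic_amplitudes):
    """Sum sinusoids at f0, 2*f0, 3*f0, ... with the given amplitudes."""
    wave = [0.0] * n_samples
    for k, amp in enumerate(harmonic_amplitudes, start=1):
        component = sinusoid(k * f0_hz, n_samples, amp)
        wave = [w + c for w, c in zip(wave, component)]
    return wave


simple = sinusoid(100, 800)
vowel_like = complex_periodic(100, 800, [1.0, 0.5, 0.25])

# Both waves repeat every 1/100 s, i.e. every 80 samples at 8000 Hz.
period = SAMPLE_RATE // 100
max_diff = max(abs(vowel_like[n] - vowel_like[n + period])
               for n in range(len(vowel_like) - period))
print(max_diff)  # very close to zero: the complex wave is still periodic
```

The complex wave has a more intricate shape than the sinusoid, but it repeats with the same period as its lowest component.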
Aperiodic sound
• An aperiodic sound does not repeat itself at regular intervals.
• It is perceived as not having a clear pitch.
• It is often called noise.
• Sounds from a busy marketplace, the sounds of the wind and the sea, and the sounds of many percussion instruments such as drums are aperiodic.
• Among speech sounds, consonants, especially fricatives, are aperiodic.
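One rough way to tell periodic from aperiodic signals is to compare a signal with a copy of itself shifted by one candidate period. The sketch below uses normalized autocorrelation for this. It is one possible measure, not a method taken from these slides, and the sample rate, tone frequency and random seed are assumptions.

```python
import math
import random


def normalized_autocorrelation(signal, lag):
    """Correlation between the signal and itself shifted by `lag` samples."""
    n = len(signal) - lag
    num = sum(signal[i] * signal[i + lag] for i in range(n))
    energy_a = sum(signal[i] ** 2 for i in range(n))
    energy_b = sum(signal[i + lag] ** 2 for i in range(n))
    den = math.sqrt(energy_a * energy_b)
    return num / den if den else 0.0


SAMPLE_RATE = 8000          # assumed sample rate
rng = random.Random(0)      # fixed seed so the run is repeatable

tone = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(4000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]

lag = SAMPLE_RATE // 100    # one period of the 100 Hz tone, in samples
r_tone = normalized_autocorrelation(tone, lag)
r_noise = normalized_autocorrelation(noise, lag)
print(round(r_tone, 3), round(r_noise, 3))
```

A value near 1 indicates that the signal repeats at that lag. A value near 0, as for the noise, indicates no regular repetition and therefore no clear pitch.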
Frequency, amplitude, duration
• Frequency is measured in hertz (Hz), i.e. cycles per second.
• The human ear can hear sounds between about 20 Hz and 20,000 Hz.
• Amplitude is measured in decibels (dB).
• Duration is measured in units of time, usually milliseconds (ms).
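The three units can be made concrete with a small sketch. The helper names are hypothetical. The decibel formula assumes the usual definition of a level as 20 times the base-10 logarithm of an amplitude ratio, so a decibel value is always relative to some reference amplitude.

```python
import math


def frequency_hz(period_s):
    """Frequency is the number of cycles per second: 1 / period."""
    return 1.0 / period_s


def level_db(amplitude, reference):
    """Level in decibels of `amplitude` relative to `reference`."""
    return 20.0 * math.log10(amplitude / reference)


def duration_ms(n_samples, sample_rate_hz):
    """Duration in milliseconds of a run of samples."""
    return 1000.0 * n_samples / sample_rate_hz


def is_audible(freq_hz):
    """True if the frequency lies in the 20 Hz to 20,000 Hz range."""
    return 20 <= freq_hz <= 20000


print(frequency_hz(0.01))        # a 10 ms period corresponds to about 100 Hz
print(level_db(10.0, 1.0))       # a tenfold amplitude ratio is 20 dB
print(duration_ms(4410, 44100))  # 4410 samples at 44.1 kHz last 100 ms
print(is_audible(15))            # False: below the audible range
```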
Frequency of a speech sound
• The vocal folds vibrate at a frequency known as the fundamental frequency (F0).
• The oral, nasal and pharyngeal cavities act as resonators or filters.
• The sound produced at the vocal folds, consisting of the fundamental frequency and its harmonics, is filtered in the pharyngeal, oral and nasal cavities.
• The final frequency content of a speech sound is the result of these two together.
Source-filter model of speech production
• Speech sounds are generated at a fundamental frequency resulting from the vibration of the vocal cords, which is known as the source of the sound.
• For voiceless sounds there is no fundamental frequency.
• The supra-laryngeal vocal tract acts as the filter for the speech sounds.
• The height of the tongue, the tongue position, lip rounding, the shape of the mouth and the velum all play a role in filtering the sound.
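The source-filter idea can be sketched with a toy synthesizer: a train of impulses stands in for the vocal-fold source, and a single two-pole resonator stands in for one vocal tract resonance. This is a deliberately minimal illustration, not a model of real speech. The sample rate, the 100 Hz source, the 500 Hz resonance and its 80 Hz bandwidth are all assumed values, and the function names are hypothetical.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed)


def impulse_train(f0_hz, n_samples):
    """Source: one impulse per period of the fundamental frequency."""
    period = round(SAMPLE_RATE / f0_hz)
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def resonator(signal, centre_hz, bandwidth_hz):
    """Filter: two-pole resonator that boosts energy near centre_hz."""
    r = math.exp(-math.pi * bandwidth_hz / SAMPLE_RATE)
    theta = 2 * math.pi * centre_hz / SAMPLE_RATE
    a1, a2 = 2 * r * math.cos(theta), -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def magnitude_at(signal, freq_hz):
    """Magnitude of the signal's spectrum at a single frequency."""
    w = 2 * math.pi * freq_hz / SAMPLE_RATE
    return abs(sum(x * cmath.exp(-1j * w * n) for n, x in enumerate(signal)))


source = impulse_train(100, 1600)
output = resonator(source, 500, 80)

source_500, source_1500 = magnitude_at(source, 500), magnitude_at(source, 1500)
output_500, output_1500 = magnitude_at(output, 500), magnitude_at(output, 1500)
print(source_500, source_1500)  # the source has equal energy at both harmonics
print(output_500 > output_1500)  # the filter favours the harmonic at 500 Hz
```

The source contains the same harmonics before and after filtering. Only their relative strengths change, which is the sense in which the vocal tract shapes the sound without changing the fundamental frequency.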
Sources of speech sound
• A source of speech sound is the input of acoustic energy from which the speech sound is generated.
• Sources can be:
• Voicing, i.e. vibration of the vocal cords, for all voiced sounds and especially for the vowels
• Noise, i.e. a source generated by an obstruction that creates air turbulence, for the consonants
Noise
• Noise creates aperiodic sounds.
• Noise can be aspiration (generated at the glottis) or friction (generated elsewhere).
• A speech sound may come from two sources; for the voiced fricatives, for example, both voicing and friction are sources.
Filter
• An acoustic filter modifies the sound produced by the source.
• An acoustic filter is a device which passes certain frequencies and attenuates others.
• An important characteristic of a filter is its transfer function: the ratio of the output to the input at each frequency.
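As an illustration, reading the transfer function as the ratio of output amplitude to input amplitude at each frequency, the sketch below evaluates it for a very simple digital low-pass filter, y[n] = (1 - a) x[n] + a y[n-1]. This filter is chosen only because its transfer function is easy to write down; it is not a vocal tract model, and the coefficient and sample rate are assumed values.

```python
import cmath
import math


def gain(freq_hz, a=0.9, sample_rate_hz=8000):
    """Output/input amplitude ratio of y[n] = (1 - a) x[n] + a y[n-1]."""
    w = 2 * math.pi * freq_hz / sample_rate_hz
    return abs((1 - a) / (1 - a * cmath.exp(-1j * w)))


def gain_db(freq_hz):
    """The same ratio expressed in decibels."""
    return 20 * math.log10(gain(freq_hz))


for f in (0, 100, 1000, 3000):
    print(f, round(gain(f), 3), round(gain_db(f), 1))
```

Low frequencies pass with a ratio close to 1 (about 0 dB) and higher frequencies are attenuated, which is what "passes certain frequencies and attenuates others" means in practice.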
Spectrogram

• A spectrogram is a visual representation of a speech sound in terms of its acoustic signal.
• The vertical axis of a spectrogram shows frequency and the horizontal axis shows time in milliseconds. The energy concentration, or amplitude, of a sound is shown by the darkness of the bands on the spectrogram.
• Spectrograms are the most useful representation of speech sounds for speech analysis as well as synthesis.
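A spectrogram can be computed by cutting the signal into short overlapping frames and measuring the energy at each frequency in each frame. The following is a minimal, slow sketch in plain Python using a Hann window and a direct Fourier sum. The frame length, hop size, sample rate and test signal are assumptions, and real tools use much faster algorithms.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed)
FRAME_LEN = 64      # samples per analysis frame (assumed)
HOP = 32            # samples between frame starts (assumed)


def spectrogram(signal):
    """Return one magnitude spectrum (list of bins) per time frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (FRAME_LEN - 1))
              for n in range(FRAME_LEN)]
    frames = []
    for start in range(0, len(signal) - FRAME_LEN + 1, HOP):
        frame = [signal[start + n] * window[n] for n in range(FRAME_LEN)]
        spectrum = []
        for k in range(FRAME_LEN // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / FRAME_LEN)
                      for n in range(FRAME_LEN))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


def peak_frequency_hz(spectrum):
    """Frequency of the strongest bin, i.e. the darkest point in a column."""
    k = max(range(len(spectrum)), key=spectrum.__getitem__)
    return k * SAMPLE_RATE / FRAME_LEN


# Test signal: 500 Hz for the first 50 ms, then 1500 Hz for the next 50 ms.
signal = [math.sin(2 * math.pi * (500 if n < 400 else 1500) * n / SAMPLE_RATE)
          for n in range(800)]

frames = spectrogram(signal)
first_peak = peak_frequency_hz(frames[0])
last_peak = peak_frequency_hz(frames[-1])
print(first_peak, last_peak)
```

Each inner list corresponds to one vertical slice of the picture: time runs along the outer list, frequency along the inner one, and the magnitude plays the role of darkness.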
Wide-band and narrow-band spectrograms
• Wide-band spectrogram: a spectrogram produced using an analysis scheme which emphasizes temporal changes in the signal, with short-time spectrum calculations (about 3 ms) or highly damped analysis filters (bandwidth about 300 Hz). This is the type most often used in speech analysis.
• Narrow-band spectrogram: a spectrogram produced using an analysis scheme which emphasizes frequency changes in the signal, with long-time spectrum calculations (about 20 ms) or lightly damped analysis filters (bandwidth about 45 Hz).
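The window lengths and filter bandwidths quoted above are two views of the same trade-off: the shorter the analysis window, the wider the effective bandwidth. As a rough rule of thumb, the bandwidth is about the reciprocal of the window duration. The sketch below uses that approximation; the exact constant depends on the window shape, and the 100 Hz fundamental is an assumed example value.

```python
def approx_bandwidth_hz(window_ms):
    """Rule-of-thumb analysis bandwidth: roughly 1 / window duration."""
    return 1000.0 / window_ms


def resolves_harmonics(bandwidth_hz, f0_hz):
    """Harmonics are f0 apart, so they separate only if bandwidth < f0."""
    return bandwidth_hz < f0_hz


wide = approx_bandwidth_hz(3)     # about 333 Hz, close to the quoted 300 Hz
narrow = approx_bandwidth_hz(20)  # 50 Hz, close to the quoted 45 Hz
print(round(wide), round(narrow))

f0 = 100  # assumed fundamental frequency of the voice, in Hz
print(resolves_harmonics(300, f0))  # False: wide-band blurs harmonics
print(resolves_harmonics(45, f0))   # True: narrow-band shows them
```

This is why a wide-band spectrogram shows broad formant bands and fine timing, while a narrow-band spectrogram shows the individual harmonics of the voice.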
Formants
• A formant is a dark band on a wide-band spectrogram, which corresponds to a vocal tract resonance. Technically, it represents a set of adjacent harmonics.
• F1 rule of vowels: F1 is inversely related to the height of the tongue.
• F2 rule of vowels: F2 is directly related to the frontness of the tongue.
• All formants are lowered by lip rounding.
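The F1 and F2 rules can be checked against a small table of formant values. The numbers below are approximate textbook-style averages of the kind reported for adult male speakers of American English; they are included here as illustrative assumptions, not as measurements from these slides, and the ASCII labels stand in for phonetic symbols.

```python
# (F1, F2) in Hz; approximate, illustrative values only.
FORMANTS = {
    "i": (270, 2290),   # high front vowel
    "ae": (660, 1720),  # low front vowel
    "a": (730, 1090),   # low back vowel
    "u": (300, 870),    # high back vowel
}

HIGH, LOW = ("i", "u"), ("ae", "a")
FRONT, BACK = ("i", "ae"), ("a", "u")


def f1(vowel):
    return FORMANTS[vowel][0]


def f2(vowel):
    return FORMANTS[vowel][1]


# F1 rule: the higher the tongue, the lower F1.
f1_rule_holds = max(f1(v) for v in HIGH) < min(f1(v) for v in LOW)

# F2 rule: the further front the tongue, the higher F2.
f2_rule_holds = min(f2(v) for v in FRONT) > max(f2(v) for v in BACK)

print(f1_rule_holds, f2_rule_holds)
```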
Spectrograms of vowels
• https://youtu.be/MyNrmiJQ4dI
• In the video, see the spectrograms of the high and low cardinal vowels.
• If we plot the vowels on a graph with their F1 and F2 on the vertical and horizontal axes, we get a picture similar to the cardinal vowel chart, which is based on the articulation of the vowels.
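The plot described above can be sketched as a small text chart. To make it resemble the articulatory vowel chart, this sketch puts low F1 at the top and high F2 at the left; that axis orientation is a common plotting convention and an assumption here. The formant values are approximate illustrative figures, and the grid size and frequency ranges are arbitrary choices.

```python
# (F1, F2) in Hz; approximate, illustrative values only.
VOWELS = {"i": (270, 2290), "u": (300, 870), "a": (730, 1090)}

F1_RANGE = (200, 800)   # assumed vertical range in Hz
F2_RANGE = (800, 2400)  # assumed horizontal range in Hz


def chart_cell(f1, f2, rows=5, cols=9):
    """Map (F1, F2) to a grid cell: low F1 at the top, high F2 at the left."""
    row = round((f1 - F1_RANGE[0]) / (F1_RANGE[1] - F1_RANGE[0]) * (rows - 1))
    col = round((F2_RANGE[1] - f2) / (F2_RANGE[1] - F2_RANGE[0]) * (cols - 1))
    # Clamp so that values outside the assumed ranges stay on the grid.
    return min(max(row, 0), rows - 1), min(max(col, 0), cols - 1)


def render_chart(vowels, rows=5, cols=9):
    """Draw the vowels on a grid of dots."""
    grid = [["." for _ in range(cols)] for _ in range(rows)]
    for label, (f1, f2) in vowels.items():
        row, col = chart_cell(f1, f2, rows, cols)
        grid[row][col] = label
    return "\n".join(" ".join(line) for line in grid)


print(render_chart(VOWELS))
```

In the printed grid the high front vowel lands near the top left, the high back vowel near the top right and the low vowel near the bottom, mirroring their positions on the articulatory chart.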
Reading Spectrogram of a stop
• Stops are acoustically characterized by the release of a burst after a complete closure. The closure is indicated by a gap in the spectrogram.
• Voicing for the following vowel starts after the burst. For voiced stops, a voice bar can be seen before the release of the burst, running parallel to the horizontal axis at low frequencies.
• The time between the burst and the start of voicing is known as Voice Onset Time (VOT).
Comparing Voiced and Voiceless stops
• VOT is less than 30 ms for a voiced stop and more than 50 ms for a voiceless stop.
• Voiced stops are characterized by negative VOT (–VOT), where voicing begins before the release, and voiceless stops by positive VOT (+VOT), where voicing begins after the release.
• Voiced stops have a voice bar during the closure, but voiceless stops do not.
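The VOT measurement and the thresholds given above can be written as a small sketch. The function names and the example times are hypothetical, and the "ambiguous" label for values between 30 ms and 50 ms is an assumption, since the text does not say how to treat that range.

```python
def voice_onset_time_ms(burst_ms, voicing_onset_ms):
    """VOT: time from the burst to the start of voicing.

    Negative when voicing starts before the burst is released.
    """
    return voicing_onset_ms - burst_ms


def classify_stop(vot_ms, voiced_max_ms=30, voiceless_min_ms=50):
    """Classify a stop using the 30 ms and 50 ms thresholds."""
    if vot_ms < voiced_max_ms:
        return "voiced"
    if vot_ms > voiceless_min_ms:
        return "voiceless"
    return "ambiguous"  # assumption: the text leaves this range open


# Hypothetical times read off a spectrogram, in milliseconds.
long_lag = voice_onset_time_ms(burst_ms=120, voicing_onset_ms=190)
prevoiced = voice_onset_time_ms(burst_ms=120, voicing_onset_ms=60)

print(long_lag, classify_stop(long_lag))
print(prevoiced, classify_stop(prevoiced))
```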
Nasals

• Nasals are acoustically characterized by a formant structure like that of the vowels but with lower amplitude, so their formants appear fainter than those of the vowels.
• The first formant occurs at a much lower frequency.
• There is a large region with little or no energy above the first formant.
Fricatives
• Fricatives are produced with noise as the acoustic source, caused by friction or turbulence during their production.
• The noise appears in the spectrogram as a concentration of energy in a vertical band. This random scatter of energy is concentrated at different frequencies for different fricatives: e.g. it is much lower for the dental and labio-dental fricatives, higher for the palato-alveolar fricatives and quite high for the alveolar fricatives.
References
• To learn how to read spectrograms, go through these pages:
• https://home.cc.umanitoba.ca/~krussll/phonetics/acoustic/spectrogram-sounds.html#:~:text=Fricatives,see%20on%20a%20TV%20screen.
• http://www.linguisticsnetwork.com/phonetics-the-basics-about-acoustic-features-of-consonants-in-standard-english/
• https://home.cc.umanitoba.ca/~robh/
