Unit 2: Sound/Audio System
Syllabus: Concept of sound system, music and speech, speech analysis, speech transformation
Prepared by: Er. Hemanta Bohora
Introduction
Sound is a type of energy that travels through matter in waves, usually through air, but
also through liquids and solids. At its core, sound is produced when an object vibrates,
causing the surrounding molecules to vibrate as well. This vibration travels in waves until
it reaches a listener's ear, where it’s perceived as sound.
• Definition: Sound is energy that moves in waves through a medium like air, water, or solids.
• Frequency: Determines pitch; a higher frequency means a higher pitch, and a lower frequency
  means a lower pitch.
• Creation: It’s generated by vibrations in an object, which cause surrounding molecules to vibrate
  and transmit the sound wave.
• Amplitude: Relates to volume; higher amplitude results in louder sounds, while lower amplitude
  results in quieter sounds.
• Human Perception: Sound waves reach the ear, causing the eardrum to vibrate, which the
  brain interprets as sound.
• Medium Requirement: Sound needs a medium (like air or water) to travel; it cannot move
  through a vacuum.
• Field of Study (Acoustics): Acoustics is the science of sound, exploring its production,
  transmission, and effects in various applications.
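The roles of frequency (pitch) and amplitude (loudness) can be made concrete with a short sketch. The snippet below is a minimal illustration, assuming NumPy and a 44.1 kHz sample rate; the `sine_tone` helper is a hypothetical name introduced only for this example.

```python
import numpy as np

SAMPLE_RATE = 44100  # samples per second (the CD-quality rate)

def sine_tone(frequency_hz, amplitude, duration_s=1.0):
    """Generate a sine wave: frequency sets the pitch, amplitude sets the loudness."""
    t = np.linspace(0.0, duration_s, int(SAMPLE_RATE * duration_s), endpoint=False)
    return amplitude * np.sin(2.0 * np.pi * frequency_hz * t)

low_quiet = sine_tone(220.0, 0.2)  # lower frequency (220 Hz), smaller amplitude: low, quiet tone
high_loud = sine_tone(880.0, 0.9)  # higher frequency (880 Hz), larger amplitude: high, loud tone

print(low_quiet.shape, high_loud.max())
```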
Speech Generation
• Speech Generation is the process of creating spoken language from
  text or other forms of input.
• It encompasses technologies like text-to-speech (TTS) systems,
  conversational AI, and voice cloning, and is critical in applications such
  as virtual assistants, accessibility tools, and robotics.
• The field has seen significant advancements with the introduction of
  neural network-based models like WaveNet and Tacotron, which
  produce highly natural, expressive speech.
• Challenges remain in making speech generation more natural, context-
  sensitive, and scalable, especially for multilingual or emotionally varied
  speech. Despite these challenges, speech generation continues to
  improve, enhancing human-computer interaction and accessibility
  across numerous industries.
Techniques Used in Speech Generation
1. Concatenative Synthesis:
   • A popular method for generating speech in which prerecorded human speech segments (such as
     syllables or phonemes) are stitched together to form sentences (see the sketch after this list).
   • Limitations: While this approach can produce highly natural-sounding speech, it requires a
     large database of recorded voice data and may sound robotic if the segments are joined poorly.
2. Parametric Synthesis:
   • Speech generation models that create audio waveforms from parameters such as pitch,
     duration, and voice quality. Formant synthesis and HMM-based synthesis are examples of
     parametric synthesis methods.
   • Limitations: Though efficient, the generated speech often sounds less natural than
     concatenative synthesis.
3. Neural Network-Based Models:
   • WaveNet (by DeepMind): A deep neural network that directly generates raw audio waveforms,
     achieving much higher naturalness in speech than traditional methods.
   • Tacotron and Tacotron 2: Text-to-speech systems that convert text into spectrograms, which
     are then turned into speech waveforms by another model (such as a vocoder).
   • Prosody Prediction: Neural networks are also employed to predict and generate
     natural-sounding prosody (intonation, pitch, rhythm) in generated speech.
4. End-to-End Models:
   • Recent advances involve end-to-end neural models that generate speech directly from text,
     without separate steps such as phonetic transcription, linguistic analysis, or waveform
     generation; the entire process is learned in one unified pipeline.
   • Example: FastSpeech and FastSpeech 2 are end-to-end systems that produce speech with high
     efficiency and naturalness.
5. Voice Cloning:
   • Voice cloning generates speech that mimics a specific person's voice from a limited amount
     of recorded data, typically using deep learning models that capture the unique
     characteristics of that person's voice and speech patterns.
   • Applications: Personalized virtual assistants, content creation, and accessibility tools.
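As referenced in technique 1, here is a minimal sketch of concatenative synthesis, assuming NumPy. The "prerecorded" units are stand-in sine tones, and `unit_db` and `concatenate_units` are hypothetical names introduced only for this illustration; a real system would draw its segments from a recorded-speech database.

```python
import numpy as np

SAMPLE_RATE = 16000  # a common sample rate for speech

def dummy_unit(freq_hz, duration_s=0.15):
    """Stand-in for a prerecorded speech segment (here just a sine tone)."""
    t = np.linspace(0.0, duration_s, int(SAMPLE_RATE * duration_s), endpoint=False)
    return 0.5 * np.sin(2.0 * np.pi * freq_hz * t)

# Stand-in "database" of prerecorded units, keyed by phoneme symbol.
unit_db = {"m": dummy_unit(150.0), "a": dummy_unit(220.0), "t": dummy_unit(400.0)}

def concatenate_units(phonemes, fade_samples=80):
    """Stitch units end to end, crossfading at each joint to hide the seams."""
    out = unit_db[phonemes[0]].copy()
    fade = np.linspace(0.0, 1.0, fade_samples)
    for p in phonemes[1:]:
        unit = unit_db[p]
        # Fade the tail of the output into the head of the next unit.
        out[-fade_samples:] = out[-fade_samples:] * fade[::-1] + unit[:fade_samples] * fade
        out = np.concatenate([out, unit[fade_samples:]])
    return out

speech = concatenate_units(["m", "a", "t"])  # crude waveform for "mat"
print(len(speech) / SAMPLE_RATE, "seconds")
```

Poorly matched joints are exactly what makes badly built concatenative systems sound robotic: the crossfade above hides amplitude discontinuities, but a real system must also match pitch and spectral shape across each joint.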
Basic Notations:
- The lowest periodic spectral component of the speech signal is called the
fundamental frequency. It is present in voiced sounds (a simple way of estimating it is
sketched after this list).
- A phone is the smallest speech unit, such as the m of mat and the b of bat in
English.
- Allophones mark the variants of a phone. For example, the aspirated p of pit and
the unaspirated p of spit are allophones of the English phoneme p.
- The morph marks the smallest speech unit that carries a meaning by itself.
- A voiced sound is generated through the vocal cords; m, v, and l are examples
of voiced sounds. The pronunciation of a voiced sound depends strongly on the
individual speaker.
- During the generation of an unvoiced sound, the vocal cords are open; f and s
are examples of unvoiced sounds.
- Vowels: speech sounds created by the relatively free passage of breath through the
larynx and oral cavity, e.g., a, e, i, o, and u.
- Consonants: speech sounds produced by a partial or complete obstruction of
the air stream by any of the various constrictions of the speech organs,
e.g., the m of mother and the ch of chew.
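As referenced above, the fundamental frequency of a voiced sound can be estimated in several ways; the snippet below is a minimal sketch using the autocorrelation peak, assuming NumPy, a reasonably clean voiced signal, and a search range of 50 to 400 Hz. The `estimate_f0` helper is a hypothetical name for this illustration.

```python
import numpy as np

def estimate_f0(signal, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate the fundamental frequency from the autocorrelation peak."""
    signal = signal - signal.mean()
    # Keep only the non-negative lags of the autocorrelation.
    corr = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    lag_min = int(sample_rate / f_max)  # shortest period considered
    lag_max = int(sample_rate / f_min)  # longest period considered
    best_lag = lag_min + np.argmax(corr[lag_min:lag_max])
    return sample_rate / best_lag

# Quick check on a synthetic 120 Hz "voiced" tone.
sr = 16000
t = np.linspace(0.0, 0.5, int(sr * 0.5), endpoint=False)
voiced = np.sin(2.0 * np.pi * 120.0 * t)
print(round(estimate_f0(voiced, sr), 1))  # close to 120
```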
Speech Analysis:
Speech analysis/input deals with the following research areas:
(1) Who?
- Human speech has certain characteristics determined by the speaker. Speech
analysis can therefore serve to determine who is speaking, i.e., to recognize a
speaker for identification and verification.
(2) What?
- Another main task of speech analysis is to determine what has been said, i.e., to
recognize and understand the speech signal itself.
(3) How?
- A further area of speech analysis researches speech patterns with respect to how
a certain statement was said.
Figure: Speech recognition system: task division into system components, using the
basic principle of "data reduction through property extraction"
Speech Transmission:
- The area of speech transmission deals with efficient coding of the speech
signal to allow speech/sound transmission at low transmission rates over
networks.
- The goal is to provide the receiver with the same speech/sound quality as was
generated at the sender side.
Some Techniques for Speech Transmission:
(1) Pulse Code Modulation:
A straightforward technique for digitizing an analog signal is pulse code
modulation (PCM). It meets the quality demands of stereo audio signals at the data
rate used for CDs: with 44.1 kHz sampling, 16-bit samples, and two channels, the
rate is 44,100 × 2 × 2 = 176,400 bytes/s.
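The 176,400 bytes/s figure follows directly from the CD parameters; a quick check in Python:

```python
# CD-quality PCM: 44.1 kHz sampling, 16-bit samples, 2 channels (stereo).
sample_rate_hz = 44100
bits_per_sample = 16
channels = 2

bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
print(bytes_per_second)  # 176400 bytes/s, matching the figure in the text
```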
(2) Source Encoding:
Figure: Components of a speech transmission system using source encoding
In source encoding, the achievable transmission rate depends on the original signal:
it must have certain characteristics that can be exploited in compression.
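One classic technique of this kind (not named in the notes, added here only as an illustration) is mu-law companding, used in G.711 telephony: it exploits the fact that small amplitudes dominate speech by spending more of an 8-bit code range on them. A minimal sketch, assuming NumPy and input samples normalized to [-1, 1]:

```python
import numpy as np

MU = 255.0  # compression parameter of the mu-law curve

def mulaw_encode(x):
    """Compress amplitudes logarithmically, then quantize to 8 bits."""
    y = np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)
    return np.round((y + 1.0) / 2.0 * 255.0).astype(np.uint8)

def mulaw_decode(codes):
    """Invert the 8-bit quantization and the companding curve."""
    y = codes.astype(np.float64) / 255.0 * 2.0 - 1.0
    return np.sign(y) * (np.power(1.0 + MU, np.abs(y)) - 1.0) / MU

x = np.array([-0.5, -0.01, 0.0, 0.01, 0.5])
print(mulaw_decode(mulaw_encode(x)))  # close to x, using only 8 bits/sample
```

At telephone quality (8 kHz sampling, 8 bits), this yields 8,000 bytes/s, far below the 176,400 bytes/s of CD-quality PCM.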
(3) Recognition-Synthesis Method:
Figure: Components of a recognition-synthesis system for speech transmission
This method performs speech analysis at the sender and speech synthesis during
reconstruction at the receiver: the recognized speech elements are characterized by
a few bits and transmitted over the multimedia system. The data rate defines the
quality.
Example:
Calculate the file size in bytes for a 60-second recording at 44.1 kHz, 8-bit
resolution, stereo sound.
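A worked solution: file size = sample rate × bytes per sample × channels × duration.

```python
# 60 s recording, 44.1 kHz sampling, 8-bit (1 byte) samples, 2 channels (stereo).
sample_rate_hz = 44100
bits_per_sample = 8
channels = 2
duration_s = 60

size_bytes = sample_rate_hz * (bits_per_sample // 8) * channels * duration_s
print(size_bytes)                  # 5292000 bytes
print(size_bytes / (1024 * 1024))  # about 5.05 MB
```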
Sound Types and Their Number of Channels