
Chapter I. Introduction 1-13

This document is the table of contents of a thesis on speaker recognition. The thesis has six chapters: an introduction, a literature review, feature extraction, speaker modeling, dimensionality reduction of feature vectors, and conclusions and future work. It also lists two appendices, the references, and a brief profile of the research scholar. The chapters cover the fundamentals of speaker recognition, a review of existing approaches, MFCC-based feature extraction with a proposed voice activity detection method, speaker modeling with neural networks and support vector machines, feature selection using genetic algorithms, and performance analysis.

Uploaded by Abdelkbir Ws
Copyright © All Rights Reserved

TABLE OF CONTENTS

Declaration i
Certificate of the Supervisor ii
Acknowledgements iii
List of Publications iv
Abstract v
Table of Contents vii
List of Tables xii
List of Figures xiv
List of Abbreviations xvii

CHAPTER I. INTRODUCTION 1-13


1.1 Fundamentals of Speaker Recognition 1
1.2 Applications 7
1.3 Historical Achievements in Speaker Recognition Technology 8
1.4 Challenges to the Speaker Recognition System 9
1.5 Motivation 10
1.6 Problem Formulation 11
1.7 Objectives of Research 11
1.8 Organization of Thesis 12

CHAPTER II. LITERATURE REVIEW 14-43


2.1 Introduction 14
2.1.1 Speech Production Mechanism in Human Beings 15
2.1.2 Source Filter Model of Speech Production 17
2.1.3 Short Term Analysis of Speech Signal 19
2.2 Basic Structure of Speaker Recognition System 19
2.3 Voice Activity Detection 22
2.4 Feature Extraction Methods used in Speaker Recognition 23
2.4.1 Spectral Features 24
2.4.2 Dynamic Features 25
2.4.3 Prosodic Features 26
2.4.4 High-level Features 27
2.5 Speaker Modeling - Classical Approaches 27
2.5.1 Template Models 28
2.5.2 VQ Source Modeling 29
2.5.3 Hidden Markov Model 30
2.5.4 Neural Networks 31
2.5.5 Support Vector Machines 32
2.5.6 Gaussian Mixture Models 32
2.6 Dimensionality Reduction Techniques 35
2.7 Performance Terms for Speaker Recognition Task 36
2.8 Gaps in the Study 41
2.9 Conclusions 42

CHAPTER III. FEATURE EXTRACTION 44-76


3.1 Introduction 44
3.2 Pre-processing 47
3.2.1 Pre-emphasis 48
3.2.2 Voice Activity Detection 49
3.3 Proposed Method of Voice Activity Detection 52
3.4 Mel Frequency Cepstral Coefficients 54
3.4.1 Frame Blocking 55
3.4.2 Windowing 56
3.4.3 Short Term Fast Fourier Transform 57
3.4.4 Mel-Frequency Warping 57
3.4.5 Log Compression and Discrete Cosine Transform 59
3.4.6 Delta and Delta-Delta Coefficients 60
3.5 Simulation 62
3.5.1 Voice Activity Detection 63
3.5.2 MFCC 63
3.6 Feature Extraction using MFCC and its Derivatives 65
3.6.1 Number of filters in the filter bank vs. Identification Rate 65
3.6.2 Effect of variation in Type of Window 66
3.6.3 Effect of Adding Derivatives 67
3.7 Effect of VAD on Speaker Recognition Rate 69
3.8 Factors affecting MFCC performance 71
3.9 Conclusions 75

CHAPTER IV. SPEAKER MODELING 77-109


4.1 The Neural Network 77
4.2 Network Structures 80
4.3 Training of Artificial Neural Networks 82
4.4 Implementation of the Speaker Recognition System using Back Propagation Algorithm 86
4.5 Support Vector Machines 89
4.6 SVM Classification Mechanism 91
4.6.1 Linear Separable Case 91
4.6.2 Linear Non-separable Case 94
4.6.3 Nonlinear Case 95
4.7 Implementation of the Speaker Recognition System using SVM 97
4.8 Performance of the Speaker Recognition System 100
4.8.1 Performance of the Speaker Identification System in Presence of Noise 100
4.8.2 Relative Performance of SVM and Neural Network in a Speaker Recognition System 102
4.9 Real Time Speaker Recognition System for Hindi Words 103
4.9.1 Methodology 104
4.9.2 Graphical User Interface (GUI) for Real Time Speaker Recognition 106
4.9.3 Display on LCD 108
4.10 Conclusions 109

CHAPTER V. DIMENSIONALITY REDUCTION OF FEATURE VECTORS 110-128

5.1 Introduction 110
5.2 Genetic Algorithms 113
5.3 Feature Selection using GA 116
5.4 Performance of the Speaker Recognition System using GA 117
5.4.1 Effect of Noise on Speaker Recognition Rate 119
5.4.2 Processing Time 121
5.4.3 Effect of Number of Utterances per Speaker on Recognition Rate 122
5.4.4 Relative Performance of GA and PCA in a Speaker Recognition System 123
5.4.5 Performance of GA with Different Kernel Functions of SVM using Reduced Dimensional Feature Vectors 125
5.5 Conclusions 127

CHAPTER VI. CONCLUSIONS AND FUTURE WORK 129-135


6.1 Introduction 129
6.2 Summary and Findings 130
6.3 Future Scope 135

APPENDICES
A. Voicebox 136
B. Description of Speaker Databases 137

REFERENCES 139

BRIEF PROFILE OF THE RESEARCH SCHOLAR 151
