Article in Journal of Applied and Fundamental Sciences · November 2015

Journal of Applied and Fundamental Sciences

VOICE RECOGNITION SYSTEM: SPEECH-TO-TEXT

Prerana Das, Kakali Acharjee, Pranab Das and Vijay Prasad*
Department of Computer Science & Engineering and Information Technology, School of Technology, Assam
Don Bosco University, Assam, India
*For correspondence. (vpd.vijay82@gmail.co)

Abstract: Voice Recognition System: Speech-to-Text is software that lets the user control computer functions
and dictate text by voice. The system consists of two components: the first processes the acoustic signal
captured by a microphone, and the second interprets the processed signal and maps it to words. A model for each
letter will be built using the Hidden Markov Model (HMM). Feature extraction will be done using Mel Frequency
Cepstral Coefficients (MFCC). Training on the features of the dataset will be done using vector quantization,
and testing will be done using the Viterbi algorithm. Home automation will be completely based on the voice
recognition system.

Keywords: Voice recognition, MFCC, HMM, Vector quantization, Viterbi algorithm, Feature extraction

1. Introduction:

Voice is the most basic, common and efficient way for people to communicate with each other. Today, speech
technologies are commonly available for a limited but interesting range of tasks. These technologies enable
machines to respond correctly and reliably to human voices and to provide useful and valuable services. Because
communicating with a computer by voice is faster than using a keyboard, people will prefer such systems.
Communication among human beings is dominated by spoken language, so it is natural for people to expect voice
interfaces to computers.

This can be accomplished by developing a voice recognition (speech-to-text) system, which allows a computer to
translate voice requests and dictation into text. Speech-to-text is the process of converting an acoustic
signal, captured using a microphone, into a set of words. The recognized text can be used for document
preparation.

2. Classification of speech recognition system:

Speech recognition systems can be classified into several types according to the type of speech utterance, the
type of speaker model and the type of vocabulary that they are able to recognize. These categories are briefly
explained below:

A. Types of speech utterance

Speech recognition systems are classified according to the type of utterance they are able to recognize:
1) Isolated word: An isolated word recognizer usually requires each spoken word to have quiet (lack of an audio
signal) on both sides of the sample window. It accepts a single word at a time.
2) Connected word: It is similar to isolated word, but it allows separate utterances to “run together” with a
minimum pause between them.
3) Continuous speech: It allows users to speak naturally while the computer determines the content in parallel.
4) Spontaneous speech: It is speech that is natural sounding and not rehearsed.
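
The requirement that an isolated word be surrounded by quiet can be illustrated with a simple energy-based endpoint detector. The following is a minimal sketch and not the method used in this paper; the function name `trim_silence`, the frame length and the energy threshold are all assumptions chosen for illustration.

```python
def trim_silence(samples, frame_len=160, threshold=0.01):
    """Keep only the span between the first and last "loud" frame.

    frame_len and threshold are illustrative assumptions, not values
    taken from the paper.
    """
    energies = []
    # Split the signal into non-overlapping frames and compute mean energy.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    # Frames whose energy exceeds the threshold are treated as speech.
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return []
    first, last = active[0], active[-1]
    return samples[first * frame_len:(last + 1) * frame_len]


# Toy signal: quiet, a burst standing in for one spoken word, quiet again.
signal = [0.0] * 320 + [0.5] * 160 + [0.0] * 320
word = trim_silence(signal)
```

A fixed threshold is the simplest choice; in a noisy room it would likely have to be estimated from the background level instead.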

B. Types of speaker model

Speech recognition systems are broadly divided into two main categories based on speaker models, namely speaker
dependent and speaker independent.

JAFS|ISSN 2395-5554 (Print)|ISSN 2395-5562 (Online)|Vol 1(2)|November 2015 191

1) Speaker dependent models: These systems are designed for a specific speaker. They are easier to develop
and more accurate, but they are not very flexible.
2) Speaker independent models: These systems are designed for a variety of speakers. They are more difficult
to develop and less accurate, but they are much more flexible.

C. Types of vocabulary

The vocabulary size of a speech recognition system affects its processing requirements, accuracy and
complexity. In the voice recognition (speech-to-text) system, the types of vocabulary can be classified as
follows:
1) Small vocabulary: single letters.
2) Medium vocabulary: two- or three-letter words.
3) Large vocabulary: words with more letters.

3. Survey of research papers:

Kuldip K. Paliwal et al. (2004) discussed that, despite their popularity as front-end parameters in speech
recognition, cepstral coefficients obtained from linear prediction analysis are sensitive to noise. They
examined the use of spectral subband centroids for robust speech recognition and reported that, if the
centroids are selected properly, recognition performance comparable to that of MFCC can be achieved. They also
proposed a procedure for constructing a dynamic centroid feature vector that essentially captures transitional
spectral information [1].

Esfandier Zavarehei et al. (2005) introduced a time-frequency estimator for enhancing noisy speech signals in
the DFT domain. It is based on a low-order autoregressive process that models the time-varying trajectory of
the DFT components of speech, which forms the state equation of a Kalman filter. A method for restarting the
Kalman filter was devised to handle the onsets of speech. The performance of this method was compared with
parametric spectral subtraction and the MMSE estimator for the enhancement of noisy speech. With the proposed
Kalman-filter method, residual noise is reduced and speech quality is improved [2].

Ibrahim Patel et al. (2010) presented an approach to speech recognition that combines frequency spectral
information with the Mel frequency in order to improve HMM-based recognition. Frequency spectral information
is incorporated into the conventional Mel spectrum-based approach to speech recognition. The Mel frequency
approach exploits the frequency observation of speech within a given resolution, which results in overlapping
resolution features and in turn limits recognition. In the HMM-based speech recognition system, resolution
decomposition is used with a mapping approach at a separating frequency. The study reports an improvement in
the quality metrics of speech recognition with respect to computational time and learning accuracy [6].

Kavita Sharma and Prateek Hakar (2012) presented speech recognition as a broader solution: technology that
recognizes speech without being targeted at a single speaker. The main problem in speech recognition is
variability in speech patterns. Speaker characteristics, including accent, noise and co-articulation, are the
most challenging sources of variation in speech. In a speech recognition system, the front-end filter bank
mimics the function of the basilar membrane. It is believed that recognition results are better when the band
subdivision is closer to human perception. The filter constructed for speech recognition is built from
estimates of the noise and the clean speech [10].

Puneet Kaur, Bhupinder Singh and Neha Kapur (2012) discussed how to use the Hidden Markov Model in speech
recognition. Developing an ASR (Automatic Speech Recognition) system requires three essential steps:
pre-processing, feature extraction and recognition; the hidden Markov model is then used to obtain the desired
result. Researchers are continuously trying to develop a perfect ASR system. Although there have been huge
advances in digital signal processing, computer performance in this field is still limited in terms of speed
of response and matching accuracy. The three techniques used by researchers are the acoustic-phonetic
approach, the pattern recognition approach and the knowledge-based approach [4].

Chadawan Ittichaichareon and Patiyuth Pramkeaw (2012) used the Signal Processing Toolbox to implement a
low-pass filter with finite impulse response (FIR). The analytical design and computational implementation of
the FIR filter were evaluated at different signal-to-noise ratio levels. Recognition results improved when the
low-pass filter was used, compared with processing the speech signal without filtering [3].
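
To make the idea of a low-pass FIR filter concrete, the sketch below designs one with the windowed-sinc method and applies it by direct convolution. It is an illustration in plain Python, not the toolbox design used in [3]; the function names, the 31-tap length, the Hamming window and the 1000 Hz cutoff at an 8000 Hz sampling rate are all assumptions.

```python
import math


def lowpass_fir(num_taps, cutoff_hz, sample_rate_hz):
    """Windowed-sinc low-pass FIR coefficients (Hamming window), unity DC gain."""
    fc = cutoff_hz / sample_rate_hz      # normalized cutoff in cycles per sample
    mid = (num_taps - 1) / 2.0           # centre of the symmetric impulse response
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal low-pass impulse response (a sinc), with its limit at x == 0.
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]     # normalize so a constant input passes unchanged


def fir_filter(signal, taps):
    """Direct-form FIR filtering; samples before the start are treated as zero."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


# Assumed example values: 31 taps, 1000 Hz cutoff, 8000 Hz sampling rate.
taps = lowpass_fir(31, 1000, 8000)
high_tone = [math.sin(2 * math.pi * 3500 * n / 8000) for n in range(200)]
filtered = fir_filter(high_tone, taps)   # the 3500 Hz tone is strongly attenuated
```

The symmetric taps give the filter linear phase, which is one common reason for preferring FIR over IIR designs in speech front ends.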

Geeta Nijhawan, Poonam Pandit and Shivanker Dev Dhingra (2013) discussed dynamic time warping (DTW) and
Mel-scale frequency cepstral coefficients (MFCC) for isolated speech recognition. Different features of the
spoken word were extracted from the input speech. Samples were collected from 5 speakers, each of whom spoke
10 digits, and a database was built on this basis. Features were then extracted using MFCC. DTW is used to
deal effectively with varying speaking speeds: it measures the similarity between two sequences that vary in
speed and time [5].
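
How DTW compares two sequences spoken at different speeds can be shown with the classic dynamic-programming recurrence. This is a minimal sketch, not the implementation from [5]; the function name `dtw_distance`, the Euclidean local cost and the toy one-dimensional feature vectors are assumptions.

```python
def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: Euclidean distance between the two frames.
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            # Allowed moves: stretch a, stretch b, or advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# The same pattern spoken "slowly" still matches with zero cost.
fast = [[0.0], [1.0], [2.0]]
slow = [[0.0], [0.0], [1.0], [1.0], [2.0]]
other = [[5.0], [5.0], [5.0]]
same_word = dtw_distance(fast, slow)
different_word = dtw_distance(fast, other)
```

In an isolated-word recognizer, the unknown utterance would be compared against one stored template per word and the template with the smallest cost would be chosen.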

4. Table of comparison:

Table 1: Table of comparison.

| Author(s) | Year | Paper name | Technique | Results |
|---|---|---|---|---|
| Kuldip K. Paliwal | 2004 | Recognition of Noisy Speech Using Dynamic Spectral Subband Centroids | Use of spectral subband Centroids | It showed that the new dynamic SSC coefficients are more resilient to noise than the MFCC features. |
| Esfandier Zavarehei | 2005 | Speech Enhancement using Kalman filters for Restoration of short-time DFT trajectories | Concept sequence modelling, two-level semantic-lexical modelling, and joint semantic-lexical modelling | Increase the semantic information utilized and tightness of integration between lexical and semantic items |
| Ibrahim Patel | 2010 | Speech Recognition Using HMM with MFCC-an analysis using Frequency Spectral Decomposition Technique | Resolution Decomposition with Separating Frequency is the mapping approach | It show an improvement in the quality metrics of speech recognition with respect to computational time, learning accuracy for a speech recognition system |
| Kavita Sharma | 2012 | Speech Denoising using Different Types of Filters | FIR, IIR, WAVELETS, FILTER | Use of filter shows that estimation of clean speech and noise for speech enhancement in speech recognition |
| Bhupinder Singh | 2012 | Speech Recognition with Hidden Markov Model | Hidden Markov Model | Develop a voice based user machine interface system |
| Patiyuth Pramkeaw | 2012 | Improving MFCC-based speech classification with FIR filter | FIR Filter | Shows the improvement in recognition rates of spoken words |
| Shivanker Dev Dhingra | 2013 | Isolated Speech Recognition using MFCC and DTW | Dynamic Time Warping (DTW) | It shows that the DTW is the best non linear feature matching technique in speech identification, with minimal error rates and fast computing speed |

5. Overview of voice recognition system:speech-to-text:

Figure 1: Overview of Voice Recognition System:Speech-to-text.

Input signal- the voice input by the user.

Feature Extraction- it should retain the useful information of the signal, discard redundant and unwanted
information, show little variation from one speaking environment to another, and occur normally and naturally
in speech.
Acoustic model- it contains statistical representations of each distinct sound that makes up a word.
Decoder- it decodes the input signal after feature extraction and shows the desired output.
Language model- it assigns a probability to a sequence of words by means of a probability distribution.
Output- the interpreted text given by the computer.
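
As an illustration of how a language model assigns a probability to a sequence of words, the sketch below trains a bigram model with add-one smoothing on a toy corpus. The paper does not specify its language model, so the bigram form, the smoothing, the function names and the home-automation style sentences are all assumptions.

```python
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their left contexts over tokenized sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for words in sentences:
        tokens = ["<s>"] + list(words) + ["</s>"]   # sentence boundary markers
        contexts.update(tokens[:-1])                # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
        vocab.update(tokens[1:])                    # everything that can be predicted
    return contexts, bigrams, len(vocab)


def sentence_probability(words, contexts, bigrams, vocab_size):
    """Add-one smoothed bigram probability of a whole sentence."""
    tokens = ["<s>"] + list(words) + ["</s>"]
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        prob *= (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
    return prob


# Hypothetical toy corpus of voice commands.
corpus = [["turn", "on", "light"], ["turn", "off", "light"], ["turn", "on", "fan"]]
contexts, bigrams, vocab_size = train_bigram(corpus)
likely = sentence_probability(["turn", "on", "light"], contexts, bigrams, vocab_size)
unlikely = sentence_probability(["light", "on", "turn"], contexts, bigrams, vocab_size)
```

A decoder can use such scores to prefer word sequences that are plausible in the language when the acoustic evidence alone is ambiguous.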

The main aim of the project is to recognize speech using MFCC and VQ techniques. Feature extraction will be
done using Mel Frequency Cepstral Coefficients (MFCC). The steps of MFCC are as follows:
1) Framing and blocking
2) Windowing
3) FFT (Fast Fourier Transform)
4) Mel-scale
5) Discrete Cosine Transform (DCT)
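
The five steps above can be sketched in plain Python as follows. This is a teaching sketch under stated assumptions and not the MATLAB implementation planned in this paper: the frame length of 256 samples, hop of 128, 20 triangular mel filters, 13 coefficients, the Hamming window and the common mel formula 2595·log10(1 + f/700) are illustrative choices, and all function names are hypothetical.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def frame_signal(signal, frame_len, hop):
    """Step 1: block the signal into overlapping frames."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def hamming(frame):
    """Step 2: taper each frame to reduce spectral leakage."""
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]


def fft(x):
    """Step 3: recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def power_spectrum(frame):
    spec = fft(frame)
    return [abs(c) ** 2 / len(frame) for c in spec[:len(frame) // 2 + 1]]


def mel_filterbank(num_filters, fft_len, sample_rate):
    """Step 4: triangular filters spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bins = [int(math.floor((fft_len + 1) * hz / sample_rate)) for hz in edges_hz]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        filt = [0.0] * (fft_len // 2 + 1)
        for k in range(lo, mid):          # rising edge of the triangle
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling edge of the triangle
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank


def dct2(values, num_coeffs):
    """Step 5: unnormalized DCT-II of the log filter-bank energies."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
            for k in range(num_coeffs)]


def mfcc(signal, sample_rate, frame_len=256, hop=128, num_filters=20, num_coeffs=13):
    bank = mel_filterbank(num_filters, frame_len, sample_rate)
    features = []
    for frame in frame_signal(signal, frame_len, hop):
        spec = power_spectrum(hamming(frame))
        # Floor the energies so the logarithm is always defined.
        log_energies = [math.log(max(sum(w * p for w, p in zip(f, spec)), 1e-10))
                        for f in bank]
        features.append(dct2(log_energies, num_coeffs))
    return features


# Toy input: a 440 Hz tone sampled at 8000 Hz (assumed values).
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(1024)]
features = mfcc(tone, 8000)   # one 13-dimensional vector per frame
```

Each utterance thus becomes a sequence of short feature vectors, which is the form that vector quantization and HMM modelling operate on.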
Feature matching will be done using the Vector Quantization technique. The steps are as follows:
1) Choose any two dimensions of the feature vectors, inspect the vectors and plot the data points.
2) Observe whether the data regions of two different speakers overlap each other and fall in the same cluster.
3) Train the VQ codebook with the function Vqlbg, which uses the LBG (Linde-Buzo-Gray) algorithm.
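
Codebook training with the LBG algorithm can be sketched as repeated codeword splitting followed by nearest-neighbour refinement. The code below is an illustrative Python version, not the Vqlbg function referred to above; the function names, the splitting factor `epsilon`, the fixed number of refinement passes and the toy two-dimensional data are assumptions.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vector, codebook):
    """Index of the codeword closest to the vector."""
    return min(range(len(codebook)), key=lambda i: squared_distance(vector, codebook[i]))


def centroid(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def lbg(vectors, codebook_size, epsilon=0.01, iterations=20):
    """Train a VQ codebook; codebook_size is assumed to be a power of two."""
    codebook = [centroid(vectors)]                  # start from the global centroid
    while len(codebook) < codebook_size:
        # Split every codeword into a slightly perturbed pair.
        codebook = [[c + sign * epsilon * (abs(c) or 1.0) for c in code]
                    for code in codebook for sign in (1, -1)]
        for _ in range(iterations):
            # Assign each training vector to its nearest codeword.
            clusters = [[] for _ in codebook]
            for v in vectors:
                clusters[nearest(v, codebook)].append(v)
            # Move each codeword to the centroid of its cell; keep it if the cell is empty.
            codebook = [centroid(cl) if cl else code
                        for cl, code in zip(clusters, codebook)]
    return codebook


# Toy training vectors forming two well-separated groups.
training = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2],
            [10.0, 10.0], [10.2, 9.8], [9.8, 10.2]]
codebook = sorted(lbg(training, 2))
```

A fixed number of refinement passes keeps the sketch short; a fuller version would stop when the average distortion no longer decreases.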
The features extracted using the MFCC algorithm will be stored in a .mat file. Models will be created using
the Hidden Markov Model (HMM). The desired output will be shown in the MATLAB interface.
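
Testing with an HMM relies on the Viterbi algorithm, which finds the most likely state sequence for a series of observations. The sketch below shows it for a small discrete HMM whose observations are assumed to be VQ codeword indices; the two states, all probabilities and the function name `viterbi` are hypothetical values for illustration and are not taken from this paper.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely state path and its probability."""
    # best[t][s]: probability of the best path that ends in state s at time t.
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor that maximizes the path probability.
            prev, p = max(((r, best[t - 1][r] * trans_p[r][s]) for r in states),
                          key=lambda item: item[1])
            best[t][s] = p * emit_p[s][observations[t]]
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical two-state model emitting one of three codeword indices.
states = ("s1", "s2")
start_p = {"s1": 0.6, "s2": 0.4}
trans_p = {"s1": {"s1": 0.7, "s2": 0.3}, "s2": {"s1": 0.4, "s2": 0.6}}
emit_p = {"s1": [0.5, 0.4, 0.1], "s2": [0.1, 0.3, 0.6]}
path, prob = viterbi([0, 1, 2], states, start_p, trans_p, emit_p)
```

For long utterances the products become very small, so practical implementations usually add log probabilities instead of multiplying probabilities.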

6. Conclusion:

In this paper, the fundamentals of voice recognition are discussed and its recent progress is investigated. The
various approaches available for developing a voice recognition system, based on the adapted feature
extraction technique and the speech recognition approach for the particular language, are compared. The main
aim of our project is to develop a system that allows the computer to translate voice requests and dictation
into text using MFCC and VQ techniques. Feature extraction and feature matching will be done using Mel
Frequency Cepstral Coefficients and the Vector Quantization technique, respectively. The extracted features
will be stored in a .mat file. A distortion measure based on minimizing the Euclidean distance will be used
when matching the unknown speech signal with the database of speech signals. In the near future, home
automation will be completely based on the voice recognition system.
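
The Euclidean-distance matching described above can be sketched as follows: each known word has a codebook, and the unknown utterance is assigned to the word whose codebook gives the lowest average distortion. This is only an illustration; the function names, the averaging over frames and the toy codebooks are assumptions, and `math.dist` requires Python 3.8 or later.

```python
import math


def average_distortion(features, codebook):
    """Mean Euclidean distance from each feature vector to its nearest codeword."""
    total = 0.0
    for vector in features:
        total += min(math.dist(vector, code) for code in codebook)
    return total / len(features)


def recognize(features, codebooks):
    """Label of the codebook with the smallest average distortion."""
    return min(codebooks, key=lambda label: average_distortion(features, codebooks[label]))


# Hypothetical codebooks for two letters and an unknown utterance.
codebooks = {
    "a": [[0.0, 0.0], [1.0, 1.0]],
    "b": [[5.0, 5.0], [6.0, 6.0]],
}
unknown = [[0.1, 0.0], [0.9, 1.1]]
label = recognize(unknown, codebooks)
```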

References:

[1] Jingdong Chen, Yiteng (Arden) Huang, Qi Li, Kuldip K. Paliwal, “Recognition of Noisy Speech using Dynamic
Spectral Subband Centroids”, IEEE Signal Processing Letters, Vol. 11, No. 2, February 2004.
[2] Hakan Erdogan, Ruhi Sarikaya, Yuqing Gao, “Using semantic analysis to improve speech recognition
performance”, Computer Speech and Language, Elsevier, 2005.
[3] Chadawan Ittichaichareon, Patiyuth Pramkeaw, “Improving MFCC-based Speech Classification with FIR
Filter”, International Conference on Computer Graphics, Simulation and Modelling (ICGSM'2012), July 28-29,
2012, Pattaya (Thailand).
[4] Bhupinder Singh, Neha Kapur, Puneet Kaur, “Speech Recognition with Hidden Markov Model: A Review”,
International Journal of Advanced Research in Computer and Software Engineering, Vol. 2, Issue 3, March 2012.
[5] Shivanker Dev Dhingra, Geeta Nijhawan, Poonam Pandit, “Isolated Speech Recognition using MFCC and
DTW”, International Journal of Advance Research in Electrical, Electronics and Instrumentation Engineering,
Vol. 2, Issue 8, August 2013.
[6] Ibrahim Patel, Dr. Y. Srinivas Rao, “Speech Recognition using HMM with MFCC-an analysis using
Frequency Spectral Decomposition Technique”, Signal and Image Processing: An International Journal (SIPIJ),
Vol. 1, No. 2, December 2010.
[7] Om Prakash Prabhakar, Navneet Kumar Sahu, “A Survey on Voice Command Recognition Technique”,
International Journal of Advanced Research in Computer and Software Engineering, Vol. 3, Issue 5, May 2013.
[8] M A Anusuya, “Speech recognition by Machine”, International Journal of Computer Science and
Information Security, Vol. 6, No. 3, 2009.
[9] Sikha Gupta, Jafreezal Jaafar, Wan Fatimah wan Ahmad, Arpit Bansal, “Feature Extraction Using MFCC”,
Signal & Image Processing: An International Journal, Vol. 4, No. 4, August 2013.
[10] Kavita Sharma, Prateek Hakar, “Speech Denoising Using Different Types of Filters”, International Journal
of Engineering Research and Applications, Vol. 2, Issue 1, Jan-Feb 2012.
