Final Slide

The document presents a speech recognition system that aims to improve accuracy and accessibility across various languages and accents. It outlines the challenges faced by existing technologies, the objectives of the project, and the development methodology, including the use of algorithms like Hidden Markov Models and Dynamic Time Warping. The expected outcome is a more intuitive user experience that promotes inclusivity and adapts to individual speech patterns.

Uploaded by

Abhishek Khadka


SPEECH RECOGNITION SYSTEM

PRESENTERS

Nirajan Khadka
Saugat Dahal
Abhishek Khadka

Presentation On Speech Recognition System 2

Table of Contents:
1. Introduction
2. Problem Statement
3. Objectives
4. Development Methodology
5. Working Mechanism
6. Algorithms Used
7. Challenges
8. Expected Outcome

01/21/2025 Presentation On Speech Recognition System 3

1. INTRODUCTION

• Speech recognition technology has recently reached a higher level of performance and robustness, allowing users to communicate with machines, and with one another, simply by talking.

• It is a process that enables computers to recognize spoken language and convert it into text. It is also known as "automatic speech recognition" (ASR), "computer speech recognition", or just "speech to text" (STT).

• Speech recognition is the process of decoding an acoustic speech signal, captured by a microphone or telephone, into a sequence of words.

2. PROBLEM STATEMENT
• Existing speech recognition technology falls short in delivering the required accuracy, adaptability, and accessibility across various applications and languages.

• The key issues include inaccuracies in transcription due to accents and noise, a lack of adaptability to specialized domains and languages, and limited accessibility for individuals with hearing impairments and those in resource-constrained areas.

• This project aims to tackle these challenges by creating a superior speech recognition system that enhances communication, efficiency, and inclusivity across diverse user groups and applications.

3. OBJECTIVES
Objectives of the Project:

• To develop a speech recognition system capable of transcribing spoken language across various accents, dialects, and noisy environments.

• To build a speech recognition system capable of producing results with high accuracy.


4. DEVELOPMENT METHODOLOGY
Agile Development Approach

Fig: Phases of Agile Development

5. WORKING MECHANISM
• Audio Input: The process begins with an "Audio Input," where the system receives spoken language in the form of audio signals.
• Audio Pre-processing: Incoming audio undergoes "Audio Pre-processing" to enhance its quality. This involves tasks like noise reduction, normalization, and segmentation to prepare the data for analysis.
• Feature Extraction: The pre-processed audio is then subjected to "Feature Extraction." This step converts raw audio signals into a more manageable and informative format, typically using techniques like Mel-frequency cepstral coefficients (MFCCs).
• Neural Network (ASR Model): The heart of the system is the "Neural Network (ASR Model)." This deep learning architecture processes the extracted features to decode spoken language into text. It learns to recognize phonemes, words, and context through training on extensive datasets.
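
The pre-processing and feature-extraction steps can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual front end: the function names, the 0.97 pre-emphasis coefficient, the frame sizes, and the synthetic 440 Hz test tone are all assumptions, and per-frame log energy is used as a simple stand-in for MFCCs.

```python
import math

def normalize(signal):
    """Peak normalization: scale samples so the largest magnitude is 1.0."""
    peak = max((abs(s) for s in signal), default=0.0)
    if peak == 0:
        return list(signal)
    return [s / peak for s in signal]

def pre_emphasis(signal, coeff=0.97):
    """Boost high frequencies: y[n] = x[n] - coeff * x[n-1]."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - coeff * signal[n - 1]
                          for n in range(1, len(signal))]

def frame_signal(signal, frame_len, hop):
    """Segmentation: split the signal into overlapping fixed-length frames."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]

def log_energy(frame):
    """Very simple per-frame feature (a stand-in for MFCCs)."""
    return math.log(sum(s * s for s in frame) + 1e-10)

# Hypothetical input: one second of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
signal = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
          for n in range(sample_rate)]

clean = pre_emphasis(normalize(signal))
frames = frame_signal(clean, frame_len=200, hop=80)  # 25 ms frames, 10 ms hop
features = [log_energy(f) for f in frames]           # one value per frame
```

A real system would replace `log_energy` with a full MFCC computation (windowing, FFT, mel filterbank, DCT), but the framing structure stays the same.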
Contd..
• Language Models: To improve accuracy and context understanding, "Language Models" are integrated into the system. These models help in predicting the most likely words or phrases based on the context of the spoken words.
• Transcription Output: The final output is the "Transcription Output," where the system provides a text-based representation of the spoken words.
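
As a rough illustration of how a language model can favor the more plausible transcription, the sketch below scores two candidate word sequences with a bigram model and add-one smoothing. The tiny training corpus, the candidate sentences, and the function name `bigram_logprob` are made up for this example; a real system would train on far more text and combine this score with the acoustic model's score.

```python
import math
from collections import Counter

# Hypothetical training text for the language model.
corpus = [
    "recognize speech with a microphone",
    "recognize speech in noisy rooms",
    "a nice beach in summer",
]

context_counts = Counter()  # how often each word appears as a left context
bigram_counts = Counter()   # how often each (previous, current) pair appears
vocab = set()
for sentence in corpus:
    words = ["<s>"] + sentence.split()   # <s> marks the sentence start
    vocab.update(words[1:])
    context_counts.update(words[:-1])
    bigram_counts.update(zip(words, words[1:]))

def bigram_logprob(sentence):
    """Log-probability of a sentence under the add-one smoothed bigram model."""
    words = ["<s>"] + sentence.split()
    total = 0.0
    for prev, word in zip(words, words[1:]):
        numerator = bigram_counts[(prev, word)] + 1
        denominator = context_counts[prev] + len(vocab)
        total += math.log(numerator / denominator)
    return total

# Two acoustically similar hypotheses of equal length.
candidates = ["recognize speech in rooms", "wreck a nice beach"]
best = max(candidates, key=bigram_logprob)
```

Here `best` is the first candidate, because its word pairs were seen more often in the training text.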

Fig: Working mechanism of Speech Recognition System

FLOWCHART


6. ALGORITHMS USED

6.1 Hidden Markov Models

• Machine learning method
• Makes use of state machines
• Based on probabilistic model
• Can only observe output from states, not the states themselves
• Example: speech recognition
• Observe: acoustic signals
• Hidden States: phonemes (distinctive sounds of a language)


Contd.
Mathematical Interpretation of Hidden Markov Models
• Here is a simplified mathematical interpretation of HMMs for speech recognition:

• States: Let $S = \{S_1, S_2, \dots, S_N\}$ represent the set of states (e.g., phonemes) in the HMM, where $N$ is the number of states.

• Transition Probabilities: Define $A = \{a_{ij}\}$, where $a_{ij}$ is the probability of transitioning from state $S_i$ to state $S_j$. These probabilities are typically organized into an $N \times N$ transition matrix.

• Observations (Features): Let $O$ represent the observed feature sequence, which consists of $T$ feature vectors: $O = \{O_1, O_2, \dots, O_T\}$.

• Emission Probabilities: For each state $S_i$, there is an emission probability distribution $B_i$ that describes the likelihood of observing the feature vector $O_t$ at time $t$ given the state $S_i$: $B_i(O_t) = P(O_t \mid S_i)$.
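
The definitions above can be written down as plain data structures. The sketch below is a toy model under stated assumptions: two hypothetical states, three quantized observation symbols instead of continuous feature vectors, and made-up probability values.

```python
# Hypothetical two-state HMM with discrete observation symbols.
states = ["S1", "S2"]             # S = {S1, ..., SN}, here N = 2
symbols = ["low", "mid", "high"]  # quantized acoustic features (assumption)

# Transition matrix A: A[i][j] = P(next state is Sj | current state is Si)
A = [
    [0.7, 0.3],
    [0.4, 0.6],
]

# Emission matrix B: B[i][k] = P(observing symbols[k] | state Si)
B = [
    [0.6, 0.3, 0.1],
    [0.1, 0.3, 0.6],
]

def is_row_stochastic(matrix, tol=1e-9):
    """Each row must be a probability distribution (non-negative, sums to 1)."""
    return all(abs(sum(row) - 1.0) < tol and all(p >= 0 for p in row)
               for row in matrix)

def emission_prob(state_index, symbol):
    """B_i(O_t) = P(O_t | S_i) for a discrete observation symbol."""
    return B[state_index][symbols.index(symbol)]

# Observation sequence O with T = 3.
O = ["low", "low", "high"]

# likelihoods[t][i] = B_i(O_t): how well each state explains each observation.
likelihoods = [[emission_prob(i, o) for i in range(len(states))] for o in O]
```

With continuous features such as MFCC vectors, `B` would typically be replaced by a density per state (for example a Gaussian mixture) rather than a lookup table.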
Contd.
• Initialization: The initial state probabilities are represented by $\pi = \{\pi_i\}$, where $\pi_i$ is the probability of starting in state $S_i$.

• Viterbi Algorithm: The Viterbi algorithm finds the best state sequence $Q^* = \{q_1^*, q_2^*, \dots, q_T^*\}$ that maximizes the joint probability:

• $Q^* = \arg\max_Q P(O, Q) = \arg\max_Q \left[ \pi_{q_1} B_{q_1}(O_1) \prod_{t=2}^{T} a_{q_{t-1} q_t} \, B_{q_t}(O_t) \right]$

• In practice, the Viterbi algorithm efficiently computes the most likely state sequence by maintaining a trellis of probabilities and backtracking to find the optimal path.
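
A minimal sketch of the trellis-and-backtracking idea, assuming a discrete-emission HMM with observations given as symbol indices. The function name `viterbi` and the toy values for `pi`, `A`, and `B` are illustrative assumptions, not the project's implementation.

```python
def viterbi(obs, pi, A, B):
    """Return (best_path, best_prob) for a discrete-emission HMM.

    obs: list of observation symbol indices
    pi[i]: probability of starting in state i
    A[i][j]: probability of moving from state i to state j
    B[i][k]: probability of emitting symbol k in state i
    """
    n_states = len(pi)
    T = len(obs)
    # delta[t][j]: probability of the best path ending in state j at time t
    delta = [[0.0] * n_states for _ in range(T)]
    # psi[t][j]: predecessor of state j on that best path
    psi = [[0] * n_states for _ in range(T)]

    for i in range(n_states):
        delta[0][i] = pi[i] * B[i][obs[0]]

    for t in range(1, T):
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: delta[t - 1][i] * A[i][j])
            psi[t][j] = best_i
            delta[t][j] = delta[t - 1][best_i] * A[best_i][j] * B[j][obs[t]]

    # Backtrack from the most probable final state.
    last = max(range(n_states), key=lambda i: delta[T - 1][i])
    path = [last]
    for t in range(T - 1, 0, -1):
        path.append(psi[t][path[-1]])
    path.reverse()
    return path, delta[T - 1][last]

# Toy model (made-up numbers): 2 states, 3 observation symbols.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]]

path, prob = viterbi([0, 0, 2], pi, A, B)
# path is [0, 0, 1]: stay in the first state twice, then switch.
```

For long sequences the products underflow, so implementations usually work with sums of log-probabilities instead of raw products.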


6.2 Dynamic Time Warping Algorithm

• Dynamic Time Warping (DTW) is a technique used in speech recognition to align and compare sequences of feature vectors, allowing for the recognition of spoken words or phrases with varying durations.

Mathematical Interpretation of Dynamic Time Warping

• Let X = {x1, x2, …, xN} represent the reference sequence, and Y = {y1, y2, …, yM} represent the input sequence.

• Define a distance or cost matrix C such that C[i][j] represents the cost of aligning xi from the reference sequence with yj from the input sequence. This cost can be computed using a distance metric (e.g., Euclidean distance).
• Create a DTW matrix D with dimensions (N+1) × (M+1), initialized with large values, and set D[0][0] = 0 so that the recursion has a starting point.
• Calculate the DTW matrix D using dynamic programming, for i = 1, …, N and j = 1, …, M:

• D[i][j] = C[i][j] + min(D[i−1][j], D[i][j−1], D[i−1][j−1])

• Backtrack through the DTW matrix from D[N][M] to find the optimal alignment path, which represents the alignment between X and Y.
• Compute a recognition score or distance measure based on the alignment path (for example, the accumulated cost D[N][M]), and make a recognition decision based on this score.
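
The steps above can be sketched as follows. This is a minimal illustration, assuming one-dimensional features and absolute difference as the distance metric; the function name `dtw`, the template words, and the sample values are invented for the example. Real systems would compare sequences of feature vectors (for example MFCCs) with a vector distance.

```python
import math

def dtw(x, y, dist=lambda a, b: abs(a - b)):
    """Return (total_cost, alignment_path) between sequences x and y."""
    n, m = len(x), len(y)
    # (N+1) x (M+1) matrix of large values, with D[0][0] = 0 as the start.
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(x[i - 1], y[j - 1])          # C[i][j]
            D[i][j] = cost + min(D[i - 1][j],        # step along x only
                                 D[i][j - 1],        # step along y only
                                 D[i - 1][j - 1])    # step along both

    # Backtrack from D[n][m] to recover the optimal alignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))                  # 0-based index pairs
        _, i, j = min((D[i - 1][j - 1], i - 1, j - 1),
                      (D[i - 1][j], i - 1, j),
                      (D[i][j - 1], i, j - 1))
    path.reverse()
    return D[n][m], path

# Hypothetical reference templates and a slower utterance of the first one.
templates = {"yes": [1, 2, 3, 4], "no": [4, 3, 2, 1]}
query = [1, 1, 2, 3, 3, 4]

scores = {word: dtw(ref, query)[0] for word, ref in templates.items()}
recognized = min(scores, key=scores.get)
```

The recognition decision here is simply the template with the smallest accumulated cost, so `recognized` is `"yes"` even though the query is longer than the template.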
7. CHALLENGES

• Ambient Noise

• Accents and Dialects

• Speaker Variability

• Limited Vocabulary

• Context Understanding

• Real Time Processing


8. EXPECTED OUTCOME
• The system aims to enhance the user experience by delivering accurate and efficient speech recognition, resulting in more intuitive and convenient interactions with digital devices.

• By enabling voice interaction, the project promotes inclusivity and ensures technology is accessible to a broader audience.

• It will make the technology more accessible to individuals with disabilities and those facing language barriers. Users will benefit from a personalized experience as the system adapts to their unique speech patterns and preferences, enhancing recognition accuracy over time.

Any further inquiries you'd like to make?


THANK YOU
