Speech Recognition

This document summarizes a student project on speech recognition: 1. The project aims to convert speech to text and allow browsing the internet via voice commands, similar to Google Assistant. 2. It also aims to read text documents aloud and make browsing and accessing documents easier. 3. The technologies used include Python libraries for text-to-speech conversion (gTTS), speech recognition (Google Speech Recognition), and audio manipulation (PyAudio).

Uploaded by

Siva Badrinath

Speech Recognition

BATCH 10

Guide: Mr. VSVS Murthy

Team members:

M Siva Badarinath 1215316025
G Manish Reddy 1215316015
M Sumanth 1215316027
Y Raviteja 1215316059
Project Brief:

Technologies used:

Python

Libraries: gTTS, Google Speech Recognition, PyAudio

Abstract:

• The aim of the project is to convert speech to text and browse the internet with voice commands (much like Google Assistant).

• Read text documents in the English language and convert them into voice.

• Make browsing the net more comfortable and accessing documents easier.
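
A minimal sketch of how a recognized phrase could be turned into a browser action. The command phrases ("search for", "open"), the search URL, and the helper names `command_to_url` and `run_command` are assumptions for illustration; the slides do not specify them. Only standard-library modules are used.

```python
import webbrowser
from urllib.parse import quote_plus

SEARCH_URL = "https://www.google.com/search?q="  # assumed search endpoint


def command_to_url(command):
    """Map a recognized phrase to a URL, or return None if it is not understood."""
    text = command.strip().lower()
    if text.startswith("search for "):
        query = text[len("search for "):]
        return SEARCH_URL + quote_plus(query)
    if text.startswith("open "):
        site = text[len("open "):].replace(" ", "")
        # Assumption: a bare site name maps to a .com domain.
        return "https://www." + site + ".com"
    return None


def run_command(command):
    """Open the matching page in the default browser; return False if unknown."""
    url = command_to_url(command)
    if url is None:
        return False
    return webbrowser.open(url)
```

Calling `run_command("search for speech recognition")` would open a search results page in the default browser. Keeping the phrase-to-URL mapping in a separate pure function makes it possible to check the command logic without a microphone or a browser.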
gTTS: Google Text-to-Speech

• A Python library and tool to interface with Google Translate’s text-to-speech API.
• Writes spoken MP3 data to a file, a file-like object (bytestring) for further audio manipulation, or stdout.
• Features flexible pre-processing and tokenizing, as well as automatic retrieval of supported languages.
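
The two output modes described above (a file, or a file-like object) might be used roughly as follows. This is a sketch, not the project's code: the function names and the `tts_class` parameter are hypothetical, added so the logic can be exercised without network access. With the default arguments it relies on the third-party `gtts` package and an internet connection.

```python
from io import BytesIO


def synthesize(text, lang="en", tts_class=None):
    """Return the spoken form of `text` as MP3 bytes."""
    if tts_class is None:
        # Imported lazily: gTTS is third-party and needs network access.
        from gtts import gTTS
        tts_class = gTTS
    buffer = BytesIO()
    # write_to_fp streams the MP3 data into any file-like object.
    tts_class(text=text, lang=lang).write_to_fp(buffer)
    return buffer.getvalue()


def save_speech(text, path, lang="en", tts_class=None):
    """Write the MP3 bytes for `text` to `path` and return the byte count."""
    data = synthesize(text, lang=lang, tts_class=tts_class)
    with open(path, "wb") as handle:
        handle.write(data)
    return len(data)
```

For example, `save_speech("Hello from the project", "hello.mp3")` would produce an MP3 file that any audio player can open.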
PyAudio:

• PyAudio provides Python bindings for PortAudio, the cross-platform audio I/O library.
• With PyAudio, you can easily use Python to play and record audio on a variety of platforms.
• PyAudio is inspired by pyPortAudio/fastaudio: Python bindings for PortAudio v18 API.
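
As an illustration of recording with PyAudio, the sketch below captures a few seconds from the default microphone and stores them as a WAV file. The sample rate, chunk size, and function names are assumptions chosen for the example, and `record` needs the third-party `pyaudio` package plus a working input device. The WAV writing uses only the standard `wave` module.

```python
import wave

RATE = 16000       # samples per second (assumed)
CHUNK = 1024       # frames read per call (assumed)
SAMPLE_WIDTH = 2   # bytes per sample for 16-bit audio


def chunks_for(seconds, rate=RATE, chunk=CHUNK):
    """Number of whole chunks to read to cover `seconds` of audio."""
    return int(rate / chunk * seconds)


def record(seconds):
    """Record mono 16-bit audio from the default input device."""
    import pyaudio  # imported lazily: third-party, needs audio hardware
    audio = pyaudio.PyAudio()
    stream = audio.open(format=pyaudio.paInt16, channels=1, rate=RATE,
                        input=True, frames_per_buffer=CHUNK)
    try:
        frames = [stream.read(CHUNK) for _ in range(chunks_for(seconds))]
    finally:
        stream.stop_stream()
        stream.close()
        audio.terminate()
    return frames


def save_wav(path, frames):
    """Write a list of raw 16-bit mono byte chunks to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(RATE)
        wav.writeframes(b"".join(frames))
```

A typical call would be `save_wav("clip.wav", record(3))`, which yields a file that a speech recognizer can read later.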

Speech Recognition

• Speech recognition is an important feature in several applications, such as home automation and artificial intelligence.

• This section aims to provide an introduction to using Python's SpeechRecognition library.

• This is useful because the library can run on small single-board computers such as the Raspberry Pi with the help of an external microphone.
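
A rough sketch of one listen-and-transcribe cycle with the SpeechRecognition library and its Google Web Speech backend. The function names, the half-second noise calibration, and the choice to return `None` for unintelligible audio are assumptions for illustration. `transcribe_microphone` needs the third-party `SpeechRecognition` and `PyAudio` packages, a microphone, and internet access.

```python
def clean_transcript(text):
    """Normalize a transcript for later matching; None becomes an empty string."""
    if text is None:
        return ""
    return " ".join(text.lower().split())


def transcribe_microphone(language="en-US"):
    """Listen for one phrase and return its text, or None if it was not understood."""
    import speech_recognition as sr  # imported lazily: third-party dependency
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Sample background noise briefly so quiet speech is not cut off.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        # The service could not make out any words.
        return None
    except sr.RequestError as error:
        # The service was unreachable or rejected the request.
        raise RuntimeError("speech service unavailable") from error
```

In use, `clean_transcript(transcribe_microphone())` gives a lowercase, whitespace-normalized string that is easy to compare against a list of known commands.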
Outcome:

• Can detect different voices
• Generates audio files from the given text
• Reduces the time spent browsing or typing files

Conclusion 1

We have analysed and looked into the various parameters in order to take the conventional voice engines to the next level. Our model demonstrates how the voice engine can be used to perform real-time activities such as controlling appliances with mere voice commands and no physical movement. The model is currently developed on the base version of Android Froyo 2.2, and it can be made compatible with later versions by following certain procedures.

The principle of DTMF signals has been followed to demonstrate the automation of electrical appliances, as the hardware setup is cost-effective and any individual can implement it without much expenditure.

Author: P. Magesh Kannan

Date of Conference: 28-30 Aug. 2014

Conclusion 2

In this paper, the results of the time optimization of the real-time speaker recognition system were presented. The obtained parameters show that the system accuracy can be held at the same level (or even increased) while reducing the number of computations related to the MFCC (8 times fewer FFT computations) and GMM (half as many computations) algorithms. As a consequence, the sampling rate can be increased to provide more accurate real-time speaker recognition (more than 10%).

Author: Radosław Weychan

Date of Conference: 23-25 Sept. 2015

Conclusion 3

This research project has developed a technique for converting a text image directly to speech using Python and a Raspberry Pi 3 minicomputer. The hardware provides a portable and economical way of converting an image to text. Our method is more reliable than others as Tesseract OCR has an accuracy of 99% and eSpeak uses two methods to read out the image with more human compassion.

Author: Hasan U. Zaman

Date of Conference: 26-28 Oct. 2018

Conclusion 4

In this work, an end-to-end speech-to-text conversion model using neural networks is implemented. Techniques such as max pooling and batch normalization are used to further optimize the model and boost its accuracy. The process of porting the trained model to a Raspberry Pi is explained. The usage of these kinds of neural network models is confined to the labels used in the dataset. Better datasets with more labels and the inclusion of various accents would improve the application's efficiency.

Author: A. Pardha Saradhi

Date of Conference: 6 April 2019

THANK YOU
