Introduction to PyDub
SPOKEN LANGUAGE PROCESSING IN PYTHON

Daniel Bourke
Machine Learning Engineer/YouTube Creator
Installing PyDub
$ pip install pydub

If using files other than .wav, install ffmpeg via ffmpeg.org

PyDub's main class, AudioSegment
# Import PyDub main class
from pydub import AudioSegment

# Import an audio file

wav_file = AudioSegment.from_file(file="wav_file.wav", format="wav")

# Format parameter only for readability

wav_file = AudioSegment.from_file(file="wav_file.wav")

type(wav_file)

pydub.audio_segment.AudioSegment

Playing an audio file
# Install simpleaudio for wav playback
$ pip install simpleaudio

# Import play function

from pydub.playback import play

# Import audio file

wav_file = AudioSegment.from_file(file="wav_file.wav")

# Play audio file

play(wav_file)

Audio parameters
# Import audio files
wav_file = AudioSegment.from_file(file="wav_file.wav")
two_speakers = AudioSegment.from_file(file="two_speakers.wav")
# Check number of channels
wav_file.channels, two_speakers.channels

1, 2

wav_file.frame_rate

480000

Audio parameters
# Find the number of bytes per sample
wav_file.sample_width

# Find the max amplitude

wav_file.max

8488

Audio parameters
# Duration of audio file in milliseconds
len(wav_file)

3284
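
To make the attributes above concrete, here is a minimal sketch that uses only Python's standard-library wave module (not PyDub) to write a short mono tone and read back the same properties. The file name, the tone and the helper names write_tone and describe are hypothetical. For a loaded file, PyDub's channels, frame_rate, sample_width and len() report the corresponding values.

```python
import math
import os
import struct
import tempfile
import wave

def write_tone(path, seconds=0.5, frame_rate=16000, freq=440.0):
    """Write a mono, 16-bit sine tone (hypothetical test signal)."""
    n_frames = int(seconds * frame_rate)
    samples = (
        int(8000 * math.sin(2 * math.pi * freq * i / frame_rate))
        for i in range(n_frames)
    )
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)           # mono
        wf.setsampwidth(2)           # 2 bytes per sample = 16-bit
        wf.setframerate(frame_rate)  # samples per second
        wf.writeframes(b"".join(struct.pack("<h", s) for s in samples))

def describe(path):
    """Return channels, frame rate, sample width and duration in ms."""
    with wave.open(path, "rb") as wf:
        duration_ms = round(1000 * wf.getnframes() / wf.getframerate())
        return {
            "channels": wf.getnchannels(),
            "frame_rate": wf.getframerate(),
            "sample_width": wf.getsampwidth(),
            "duration_ms": duration_ms,
        }

tone_path = os.path.join(tempfile.mkdtemp(), "tone.wav")
write_tone(tone_path)
info = describe(tone_path)
print(info)
```

The duration is simply the number of frames divided by the frame rate, which is why it can be reported in milliseconds without decoding the audio itself.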

Changing audio parameters
# Change ATTRIBUTENAME of AudioSegment to x
changed_audio_segment = audio_segment.set_ATTRIBUTENAME(x)

# Change sample width to 1

wav_file_width_1 = wav_file.set_sample_width(1)
wav_file_width_1.sample_width
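
Why the sample width matters: each sample is stored in sample_width bytes, so the width fixes how many distinct amplitude values a sample can take. The helper name levels below is hypothetical and the sketch is plain arithmetic, not a PyDub call.

```python
def levels(sample_width_bytes):
    """Number of distinct amplitude values for a given sample width."""
    return 2 ** (8 * sample_width_bytes)

for width in (1, 2):
    print(width, "byte(s) per sample:", levels(width), "levels")
```

Dropping from 2 bytes to 1 byte therefore shrinks the file but also leaves far fewer amplitude steps, which can make the audio sound coarser.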

Changing audio parameters
# Change sample rate
wav_file_16k = wav_file.set_frame_rate(16000)
wav_file_16k.frame_rate

16000

# Change number of channels

wav_file_1_channel = wav_file.set_channels(1)
wav_file_1_channel.channels

Let's practice!

Manipulating audio files with PyDub

Turning it down to 11
# Import audio file
wav_file = AudioSegment.from_file("wav_file.wav")
# Lower the volume by 60 dB
quiet_wav_file = wav_file - 60

# Try to recognize quiet audio
# (recognizer is assumed to be a speech_recognition Recognizer created earlier)

recognizer.recognize_google(quiet_wav_file)

UnknownValueError:

Increasing the volume
# Increase the volume by 10 dB
louder_wav_file = wav_file + 10

# Try to recognize
recognizer.recognize_google(louder_wav_file)

this is a wav file
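
To get a feel for what a decibel change means, the sketch below converts a gain in dB to the factor by which sample amplitudes are multiplied, using the standard amplitude definition of the decibel. The function name and sample values are hypothetical, and PyDub's own internals are not shown in the slides.

```python
def db_to_amplitude_ratio(db):
    """Convert a gain in decibels to an amplitude multiplier."""
    return 10 ** (db / 20)

samples = [8000, -6000, 120]  # made-up 16-bit sample values

quiet_ratio = db_to_amplitude_ratio(-60)
loud_ratio = db_to_amplitude_ratio(10)

quiet = [round(s * quiet_ratio) for s in samples]
louder = [round(s * loud_ratio) for s in samples]

print("ratio at -60 dB:", quiet_ratio, "->", quiet)
print("ratio at +10 dB:", loud_ratio, "->", louder)
```

At -60 dB the amplitudes shrink to roughly a thousandth of their original size, which is one plausible reason the recognizer can no longer make out any speech.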

This all sounds the same
# Import AudioSegment and normalize
from pydub import AudioSegment
from pydub.effects import normalize
from pydub.playback import play

# Import uneven sound audio file

loud_quiet = AudioSegment.from_file("loud_quiet.wav")
# Normalize the sound levels
normalized_loud_quiet = normalize(loud_quiet)

# Check the sound

play(normalized_loud_quiet)
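
A rough idea of what peak normalization does, sketched on a plain list of samples rather than on an AudioSegment: find the loudest sample and apply one gain so that it lands just below full scale. The function name, the 16-bit full-scale value and the 0.1 dB headroom are assumptions made for illustration, not a description of PyDub's exact implementation.

```python
def peak_normalize(samples, full_scale=32767, headroom_db=0.1):
    """Scale samples so the loudest one sits headroom_db below full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # pure silence: nothing to scale
    target_peak = full_scale * 10 ** (-headroom_db / 20)
    gain = target_peak / peak
    return [round(s * gain) for s in samples]

loud_quiet = [200, -150, 8000, -7500, 300]  # made-up sample values
normalized = peak_normalize(loud_quiet)
print(normalized)
```

Because a single gain is applied to the whole clip, the quiet parts get louder by the same factor as the loud parts, and the loudest sample decides how much room there is.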

Remixing your audio files
# Import audio with static at start
static_at_start = AudioSegment.from_file("static_at_start.wav")

# Remove the static via slicing

no_static_at_start = static_at_start[5000:]

# Check the new sound

play(no_static_at_start)
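
The slice above is expressed in milliseconds. Underneath, a millisecond offset corresponds to a position in the sample stream that depends on the frame rate. The sketch below shows that conversion on a plain list standing in for mono audio; the helper name ms_to_frame and the numbers are hypothetical.

```python
def ms_to_frame(ms, frame_rate):
    """Index of the first frame at or after the given millisecond offset."""
    return int(ms * frame_rate / 1000)

frame_rate = 16000
samples = list(range(10 * frame_rate))  # stand-in for 10 s of mono audio

start = ms_to_frame(5000, frame_rate)
trimmed = samples[start:]               # drop the first 5 seconds

remaining_ms = 1000 * len(trimmed) / frame_rate
print(start, remaining_ms)
```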

Remixing your audio files
# Import two audio files
wav_file_1 = AudioSegment.from_file("wav_file_1.wav")
wav_file_2 = AudioSegment.from_file("wav_file_2.wav")

# Combine the two audio files

wav_file_3 = wav_file_1 + wav_file_2

# Check the sound

play(wav_file_3)

# Combine two wav files and make the combination louder

louder_wav_file_3 = wav_file_1 + wav_file_2 + 10
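
Python evaluates + from left to right, so in the last line the two files are joined first and the gain is then applied to the combined audio. The toy class below (a hypothetical stand-in, not PyDub's AudioSegment) mimics that behaviour to show how parentheses change the result.

```python
class Clip:
    """Toy stand-in for an audio clip: just a list of samples."""

    def __init__(self, samples):
        self.samples = list(samples)

    def __add__(self, other):
        if isinstance(other, Clip):
            # clip + clip: play one after the other
            return Clip(self.samples + other.samples)
        # clip + number: treat the number as a gain in dB
        ratio = 10 ** (other / 20)
        return Clip(round(s * ratio) for s in self.samples)

a = Clip([100, 200])
b = Clip([300])

both_louder = a + b + 20      # (a + b) + 20: gain applies to both clips
only_b_louder = a + (b + 20)  # parentheses restrict the gain to b

print(both_louder.samples)
print(only_b_louder.samples)
```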

Splitting your audio
# Import phone call audio
phone_call = AudioSegment.from_file("phone_call.wav")
# Find number of channels
phone_call.channels

# Split stereo to mono

phone_call_channels = phone_call.split_to_mono()
phone_call_channels

[<pydub.audio_segment.AudioSegment>, <pydub.audio_segment.AudioSegment>]

Splitting your audio
# Find number of channels of first list item
phone_call_channels[0].channels

# Recognize the first channel
phone_call_channel_1 = phone_call_channels[0]
recognizer.recognize_google(phone_call_channel_1)

the pydub library is really useful
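
In uncompressed stereo audio the samples of the two channels are interleaved (left, right, left, right, ...). Splitting to mono amounts to picking every second sample, as the sketch below shows on a plain list. The function name and the sample values are hypothetical, and this is not PyDub's own code.

```python
def split_interleaved(samples, channels=2):
    """Separate interleaved samples into one list per channel."""
    return [samples[c::channels] for c in range(channels)]

stereo = [1, -1, 2, -2, 3, -3]  # L, R, L, R, L, R
left, right = split_interleaved(stereo)

print(left)
print(right)
```

With one speaker per channel, as in the phone call example, each resulting list carries a single voice and can be transcribed on its own.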

Let's code!

Converting and saving audio files with PyDub

Exporting audio files
from pydub import AudioSegment

# Import audio file

wav_file = AudioSegment.from_file("wav_file.wav")
# Increase by 10 decibels
louder_wav_file = wav_file + 10
# Export louder audio file
louder_wav_file.export(out_f="louder_wav_file.wav", format="wav")

<_io.BufferedRandom name='louder_wav_file.wav'>

Reformatting and exporting multiple audio files
import os
from pydub import AudioSegment

def make_wav(wrong_folder_path, right_folder_path):
    # Loop through wrongly formatted files
    for file in os.scandir(wrong_folder_path):
        # Only work with files with audio extensions we're fixing
        if file.path.endswith(".mp3") or file.path.endswith(".flac"):
            # Create the new .wav filename
            out_file = right_folder_path + os.path.splitext(os.path.basename(file.path))[0] + ".wav"
            # Read in the audio file and export it in wav format
            AudioSegment.from_file(file.path).export(out_file, format="wav")
            print(f"Creating {out_file}")

Reformatting and exporting multiple audio files
# Call our new function
make_wav("data/wrong_formats/", "data/right_types/")

Creating data/right_types/wav_file.wav
Creating data/right_types/flac_file.wav
Creating data/right_types/mp3_file.wav
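
The file-name handling in make_wav can be tried out separately from the audio conversion. The sketch below uses pathlib to work out which .wav path each .mp3 or .flac file would be exported to, without reading any audio. The function name plan_wav_names and the demo files are hypothetical.

```python
import tempfile
from pathlib import Path

AUDIO_EXTENSIONS = {".mp3", ".flac"}

def plan_wav_names(wrong_folder, right_folder):
    """Map each .mp3/.flac file to the .wav path it would be exported to."""
    wrong_folder, right_folder = Path(wrong_folder), Path(right_folder)
    return {
        src: right_folder / (src.stem + ".wav")
        for src in sorted(wrong_folder.iterdir())
        if src.suffix.lower() in AUDIO_EXTENSIONS
    }

# Demo with empty placeholder files in a temporary folder
demo_dir = Path(tempfile.mkdtemp())
for name in ("a.mp3", "b.flac", "notes.txt"):
    (demo_dir / name).touch()

plan = plan_wav_names(demo_dir, "out")
targets = [p.name for p in plan.values()]
print(targets)
```

Joining paths with Path objects does not depend on the folder argument ending in a slash, which the string concatenation in make_wav does.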

Manipulating and exporting
def make_no_static_louder(static_quiet, louder_no_static):
    # Loop through files with static and quiet (already in wav format)
    for file in os.scandir(static_quiet):
        # Create new file path
        out_file = louder_no_static + os.path.splitext(os.path.basename(file.path))[0] + ".wav"
        # Read the audio file
        audio_file = AudioSegment.from_file(file.path)
        # Remove the first 3.1 seconds (3100 ms), add 10 decibels and export
        (audio_file[3100:] + 10).export(out_file, format="wav")
        print(f"Creating {out_file}")

Manipulating and exporting
# Remove static and make louder
make_no_static_louder("data/static_quiet/", "data/louder_no_static/")

Creating data/louder_no_static/speech-recognition-services.wav
Creating data/louder_no_static/order-issue.wav
Creating data/louder_no_static/help-with-acount.wav

Your turn!