Chapter 1

The document provides an introduction to handling audio data in Python, covering various audio file formats and their frequency characteristics. It explains how to open audio files, convert sound wave bytes to integers, find frame rates, and visualize sound waves using libraries like NumPy and Matplotlib. Practical examples are included to demonstrate the process of working with audio data effectively.

Uploaded by

Andreu Orestes

Introduction to audio data in Python
SPOKEN LANGUAGE PROCESSING IN PYTHON

Daniel Bourke
Machine Learning Engineer/YouTube Creator
Dealing with audio files in Python
Many different kinds of audio files
mp3

wav

m4a

flac

Digital sound is measured by its sampling frequency (kHz)

1 kHz = 1,000 samples (pieces of information) per second
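
To make the unit concrete, here is a minimal sketch of the arithmetic. The helper name `samples_in` is made up for illustration and is not part of the course code.

```python
def samples_in(duration_seconds: float, sample_rate_khz: float) -> int:
    """Number of samples recorded in a clip of the given length."""
    samples_per_second = sample_rate_khz * 1000  # 1 kHz = 1000 samples per second
    return int(round(duration_seconds * samples_per_second))


print(samples_in(1, 1))    # 1000
print(samples_in(2, 48))   # 96000
```

So two seconds of 48 kHz audio already holds 96,000 values.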


Frequency examples
Streaming songs have a sampling frequency of about 32 kHz (44.1 kHz is also common)

Audiobooks and spoken language are between 8 and 16 kHz

We can't see audio files, so we have to transform them first

import wave


Opening an audio file in Python
Audio file saved as good-morning.wav

# Import audio file as wave object
good_morning = wave.open("good-morning.wav", "r")
# Convert wave object to bytes
soundwave_gm = good_morning.readframes(-1)
# View the wav file in byte form
soundwave_gm

b'\xfd\xff\xfb\xff\xf8\xff\xf8\xff\xf7\...
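
The slide assumes a file named good-morning.wav is on disk. As a self-contained sketch, the example below first writes a tiny stand-in file (the path, the four samples, and the mono/16-bit/48 kHz settings are all assumptions chosen for illustration) and then reads it back with the same `wave` module.

```python
import os
import tempfile
import wave

# Hypothetical stand-in for good-morning.wav: four 16-bit mono samples at 48 kHz.
raw_frames = b"\xfd\xff\xfb\xff\xf8\xff\xf8\xff"
wav_path = os.path.join(tempfile.mkdtemp(), "example.wav")

with wave.open(wav_path, "wb") as writer:
    writer.setnchannels(1)      # mono
    writer.setsampwidth(2)      # 2 bytes per sample = 16-bit audio
    writer.setframerate(48000)  # 48 kHz
    writer.writeframes(raw_frames)

# Opening with a context manager closes the file automatically.
with wave.open(wav_path, "rb") as reader:
    n_channels = reader.getnchannels()
    sample_width = reader.getsampwidth()
    n_frames = reader.getnframes()
    soundwave = reader.readframes(n_frames)  # read every frame as bytes

print(n_channels, sample_width, n_frames)  # 1 2 4
print(soundwave == raw_frames)             # True
```

Checking the channel count and sample width up front is useful because both determine how the bytes should be interpreted later.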


Working with audio is different
Have to convert the audio to something useful

Small sample of audio = large amount of information
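
A rough sketch of why a short clip is a lot of data: the size of uncompressed audio is duration × frame rate × channels × bytes per sample. The function name and the defaults (mono, 2 bytes per sample) are assumptions for illustration.

```python
def raw_audio_bytes(duration_seconds, frame_rate_hz, n_channels=1, sample_width_bytes=2):
    """Approximate size in bytes of uncompressed audio, excluding the file header."""
    return int(duration_seconds * frame_rate_hz * n_channels * sample_width_bytes)


print(raw_audio_bytes(5, 48_000))                # 480000
print(raw_audio_bytes(5, 48_000, n_channels=2))  # 960000
```

Five seconds of mono 48 kHz speech is therefore close to half a megabyte before any compression.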


Let's practice!
Converting sound wave bytes to integers
Converting bytes to integers
Can't work with raw bytes directly
Convert bytes to integers using NumPy

import numpy as np
# Convert soundwave_gm from bytes to integers
signal_gm = np.frombuffer(soundwave_gm, dtype='int16')
# Show the first 10 items
signal_gm[:10]

array([ -3, -5, -8, -8, -9, -13, -8, -10, -9, -11], dtype=int16)
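
The conversion can be checked by hand on the first few bytes shown earlier. This sketch assumes 16-bit samples (two bytes each, low byte first), which matches the `int16` dtype used on the slide; files with a different sample width would need a different dtype.

```python
import numpy as np

# First eight bytes from the byte output shown earlier (four 16-bit samples).
raw = b"\xfd\xff\xfb\xff\xf8\xff\xf8\xff"

# "<i2" = little-endian signed 16-bit integers.
signal = np.frombuffer(raw, dtype="<i2")
print(signal.tolist())        # [-3, -5, -8, -8]
print(len(raw), len(signal))  # 8 4

# The same first value computed with the standard library only:
print(int.from_bytes(raw[:2], byteorder="little", signed=True))  # -3
```

Note that the integer array is half as long as the byte string, because every sample takes two bytes.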


Finding the frame rate
Frequency (Hz) = length of wave object array/duration of audio file (seconds)

# Get the frame rate
framerate_gm = good_morning.getframerate()
# Show the frame rate
framerate_gm

48000

Duration of audio file (seconds) = length of wave object array/frequency (Hz)
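
As a sketch of the duration formula, the example below builds a small in-memory WAV (24,000 silent frames, mono, 16-bit, 48 kHz; all of these values are assumptions for illustration) and divides the frame count by the frame rate.

```python
import io
import wave

# Hypothetical in-memory WAV: 24,000 silent 16-bit mono frames at 48 kHz.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(48000)
    writer.writeframes(b"\x00\x00" * 24000)

buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    frame_rate = reader.getframerate()
    n_frames = reader.getnframes()

# Duration (seconds) = number of frames / frame rate (Hz)
duration_seconds = n_frames / frame_rate
print(frame_rate, n_frames, duration_seconds)  # 48000 24000 0.5
```

For mono audio the number of frames equals the length of the integer array; for multi-channel audio each frame holds one sample per channel.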


Finding sound wave timestamps
# Return evenly spaced values between start and stop
np.linspace(start=1, stop=10, num=10)

array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])

# Get the timestamps of the good morning sound wave
# (use the integer array signal_gm: the bytes object is twice as long)
time_gm = np.linspace(start=0,
                      stop=len(signal_gm)/framerate_gm,
                      num=len(signal_gm))


Finding sound wave timestamps
# View first 10 time stamps of good morning sound wave
time_gm[:10]

array([0.00000000e+00, 2.08334167e-05, 4.16668333e-05, 6.25002500e-05,
       8.33336667e-05, 1.04167083e-04, 1.25000500e-04, 1.45833917e-04,
       1.66667333e-04, 1.87500750e-04])
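
The second timestamp above (2.08334167e-05) is slightly larger than 1/48000 ≈ 2.08333333e-05 because `np.linspace` includes the end point by default. The sketch below compares that approach with an alternative, index divided by frame rate, on a tiny hypothetical signal of five samples; the alternative is a suggestion here, not the slide's method.

```python
import numpy as np

frame_rate = 48000
n_samples = 5  # tiny hypothetical signal

# Slide approach: the end point is included, so the step is a bit more than 1/frame_rate.
time_linspace = np.linspace(start=0, stop=n_samples / frame_rate, num=n_samples)

# Alternative: sample i is taken at i / frame_rate, so the step is exactly 1/frame_rate.
time_exact = np.arange(n_samples) / frame_rate

print(time_linspace[1] > time_exact[1])      # True
print(np.isclose(time_exact[1], 1 / 48000))  # True
```

For a recording with hundreds of thousands of samples the difference is tiny, which is why it barely shows in the output above.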


Let's practice!
Visualizing sound waves
Adding another sound wave
New audio file: good_afternoon.wav
Both are 48 kHz

Apply the same data transformations to all audio files
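
Since every file goes through the same steps, they can be wrapped in one function. This is a minimal sketch: the name `wav_to_arrays` is hypothetical, it only handles 16-bit mono audio, and the in-memory test file stands in for a real recording.

```python
import io
import wave

import numpy as np


def wav_to_arrays(file):
    """Return (time, signal, frame_rate) for a 16-bit mono WAV path or file object."""
    with wave.open(file, "rb") as reader:
        if reader.getsampwidth() != 2 or reader.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono audio")
        frame_rate = reader.getframerate()
        raw = reader.readframes(reader.getnframes())
    signal = np.frombuffer(raw, dtype="<i2")  # bytes -> integers
    # Timestamps built the same way as on the earlier slide.
    time = np.linspace(start=0, stop=len(signal) / frame_rate, num=len(signal))
    return time, signal, frame_rate


# Hypothetical four-sample recording written to memory instead of disk.
samples = np.array([0, 100, -100, 50], dtype="<i2")
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(48000)
    writer.writeframes(samples.tobytes())
buffer.seek(0)

time, signal, frame_rate = wav_to_arrays(buffer)
print(signal.tolist(), frame_rate)  # [0, 100, -100, 50] 48000
```

Calling the same function on each recording gives matching time and signal arrays for every file.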


Setting up a plot
import matplotlib.pyplot as plt
# Initialize figure and setup title
plt.title("Good Afternoon vs. Good Morning")
# x and y axis labels
plt.xlabel("Time (seconds)")
plt.ylabel("Amplitude")
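# Add good morning and good afternoon values
# (plot the integer arrays; signal_ga and time_ga come from applying
# the same steps to good_afternoon.wav)
plt.plot(time_ga, signal_ga, label="Good Afternoon")
plt.plot(time_gm, signal_gm, label="Good Morning",
         alpha=0.5)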
# Create a legend and show our plot
plt.legend()
plt.show()
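
The plotting code above needs the two recordings. As a runnable sketch without them, the example below plots two synthetic sine tones in place of the recordings (the tones, their amplitudes, and the output path are all assumptions) and saves the figure to a file instead of opening a window.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

frame_rate = 48000
time = np.arange(frame_rate // 10) / frame_rate  # 0.1 seconds of timestamps

# Hypothetical stand-ins for the two recordings: sine tones stored as int16 samples.
signal_a = (8000 * np.sin(2 * np.pi * 220 * time)).astype(np.int16)
signal_b = (4000 * np.sin(2 * np.pi * 440 * time)).astype(np.int16)

fig, ax = plt.subplots()
ax.set_title("Tone A vs. Tone B")
ax.set_xlabel("Time (seconds)")
ax.set_ylabel("Amplitude")
ax.plot(time, signal_a, label="Tone A (220 Hz)")
ax.plot(time, signal_b, label="Tone B (440 Hz)", alpha=0.5)  # semi-transparent overlay
ax.legend()

plot_path = os.path.join(tempfile.mkdtemp(), "waves.png")
fig.savefig(plot_path)
plt.close(fig)
print(os.path.exists(plot_path))  # True
```

Using `alpha=0.5` on the second line keeps the first wave visible where the two overlap, the same choice the slide makes.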

Time to visualize!