Dictator

Dictator is a command-line program and GUI that lets you dictate to your computer and displays the transcribed text in the terminal, ready to be copied and pasted wherever you need it.

Under the hood, Dictator uses Whisper, the AI speech recognition model from OpenAI.
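As a rough illustration of what a Whisper transcription call looks like, here is a minimal sketch. It assumes the `openai-whisper` Python package, which may not be the Whisper implementation that requirements.txt installs. `transcribe_file` is a hypothetical helper, not a function from dictator.py.

```python
def transcribe_file(model, wav_path: str) -> str:
    """Transcribe one audio file and return the cleaned-up text.

    `model` is assumed to be any object with a `transcribe(path)` method
    that returns a dict containing a "text" key, which is the shape
    returned by models from the openai-whisper package.
    """
    result = model.transcribe(wav_path)
    return result["text"].strip()


# Usage with openai-whisper (assumption; needs `pip install openai-whisper`
# and downloads the model weights on first use):
#
#   import whisper
#   model = whisper.load_model("base")
#   print(transcribe_file(model, "test_news_headline.wav"))
```

Passing the model in as an argument means it is loaded once and reused across dictations, which matters because loading is usually much slower than transcribing a short clip.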

Requirements

Dictator is tested with:

Python 3.12
Ubuntu 22.04.5 LTS
Nvidia RTX 3060 12GB

Installation

# Clone the repo
git clone git@github.com:terzza/dictator.git

# Enter the repo
cd dictator

# Create a Python virtual environment (presuming you're using Python 3.12)
python3.12 -m venv env

# Activate the virtual environment
source env/bin/activate

# Upgrade pip
pip install -U pip

# Install the dependencies
pip install -r requirements.txt

Running

# Enter the repo
cd dictator

# Activate the virtual environment
source env/bin/activate

# Either start the CLI program
python dictator.py

# Or start the GUI program
python dictator_gui.py

Roadmap

Check Ubuntu system dependencies in a fresh installation
Expose configuration options via command-line arguments
Keep a log file of transcriptions
Optionally keep a copy of all recordings (might not actually be useful :-/)
Investigate feeding PyAudio output directly into Whisper without an intermediary file
Find a better solution to the PyAudio stderr hack with sounddevice
Words per minute indicator (helpful to see how dictation might be faster than typing)
Test sequence that can be run against the test wav file with all models and devices to give an indication of system speed
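
The words-per-minute item above could be sketched as follows. This is only an illustration of the arithmetic, not existing project code: `words_per_minute` is a hypothetical helper, and it assumes words are separated by whitespace and that the recording duration is known in seconds.

```python
def words_per_minute(text: str, duration_seconds: float) -> float:
    """Return the dictation rate for `text` spoken over `duration_seconds`."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Count whitespace-separated tokens as words.
    word_count = len(text.split())
    return word_count * 60.0 / duration_seconds


# 4 words in 2 seconds -> 120.0
print(words_per_minute("the quick brown fox", 2.0))
```

Comparing this figure with your typing speed gives a quick sense of whether dictation is actually faster for you.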
