PixelSonic - Text and Speech to FSK Sound Conversion Project

PixelSonic is a digital communication project that demonstrates the conversion of text and speech into Frequency-Shift Keying (FSK) modulated sound waves. It shows how digital information can be encoded into analog audio signals, transmitted, and then decoded.

Ujjansh05/PixelSonic

PixelSonic

This project demonstrates the conversion of text and speech into Frequency-Shift Keying (FSK) modulated sound waves and back. It showcases how digital information (text) can be encoded into an analog audio signal, transmitted, and then decoded back into its original form. The project also integrates speech recognition to capture spoken words and convert them into FSK sound.


🚀 Features

  • Text to FSK Sound: Converts any given string of text into an audible FSK-modulated audio signal.
  • Speech to FSK Sound: Uses a microphone to capture spoken words, converts them to text, and then generates the corresponding FSK sound.
  • FSK Demodulation: Decodes the FSK audio waveform back into the original text.
  • Educational: Provides a clear, practical example of digital-to-analog signal modulation and demodulation principles.

⚙️ How It Works

The core of this project lies in the principles of Frequency-Shift Keying (FSK).

  1. Text to Binary: The input text is first converted to binary, with each character's ASCII code written as 8 bits (for example, 'A' becomes 01000001).

  2. FSK Modulation: A digital signal is generated by assigning two distinct frequencies to the binary digits:

    • Bit '0': Represented by a sine wave at a lower frequency (2000 Hz).
    • Bit '1': Represented by a sine wave at a higher frequency (4000 Hz).

    The code generates a continuous audio waveform by concatenating these sine waves for each bit in the binary sequence.

  3. Audio Playback: The generated waveform is played as sound using the sounddevice library.

  4. FSK Demodulation (Decoding): To convert the sound back to text, the script performs the following steps:

    • It processes the audio waveform in small segments, each corresponding to a single bit.
    • For each segment, it uses the Hilbert transform and the zero-crossing rate to determine the dominant frequency.
    • Based on whether the frequency is closer to 2000 Hz or 4000 Hz, it decodes the segment as a '0' or a '1'.
    • The resulting binary string is then converted back to ASCII text.
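
The four steps above can be sketched end to end as follows. This is a minimal illustration, not the project's actual code: the sample rate, bit duration, and all function names are assumptions, and only the 2000 Hz and 4000 Hz tones come from the description above. To stay dependency-free it estimates frequency from zero crossings alone and leaves out the Hilbert transform.

```python
import math

SAMPLE_RATE = 44100      # samples per second (assumed, not stated above)
BIT_DURATION = 0.01      # seconds per bit (assumed)
FREQ_ZERO = 2000.0       # Hz for bit '0' (from the description above)
FREQ_ONE = 4000.0        # Hz for bit '1' (from the description above)
SAMPLES_PER_BIT = int(SAMPLE_RATE * BIT_DURATION)


def text_to_bits(text):
    """Write each character's ASCII code as 8 bits, most significant bit first."""
    return "".join(format(byte, "08b") for byte in text.encode("ascii"))


def bits_to_text(bits):
    """Group the bit string into bytes and decode them as ASCII."""
    usable = len(bits) - len(bits) % 8  # ignore a trailing partial byte
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("ascii", errors="replace")


def modulate(bits):
    """Concatenate one sine-wave segment per bit."""
    samples = []
    for bit in bits:
        freq = FREQ_ONE if bit == "1" else FREQ_ZERO
        samples.extend(
            math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
            for n in range(SAMPLES_PER_BIT)
        )
    return samples


def estimate_frequency(segment):
    """Estimate frequency from sign changes; a sine crosses zero twice per cycle."""
    crossings = sum(
        1 for a, b in zip(segment, segment[1:]) if (a < 0) != (b < 0)
    )
    duration = len(segment) / SAMPLE_RATE
    return crossings / (2 * duration)


def demodulate(samples):
    """Decode each bit-length segment as whichever tone its frequency is closer to."""
    bits = []
    for start in range(0, len(samples) - SAMPLES_PER_BIT + 1, SAMPLES_PER_BIT):
        freq = estimate_frequency(samples[start:start + SAMPLES_PER_BIT])
        closer_to_one = abs(freq - FREQ_ONE) < abs(freq - FREQ_ZERO)
        bits.append("1" if closer_to_one else "0")
    return "".join(bits)


waveform = modulate(text_to_bits("Hi"))
decoded = bits_to_text(demodulate(waveform))
```

For playback, the sample list could be converted to a NumPy array and passed to sounddevice.play(array, SAMPLE_RATE), followed by sounddevice.wait(). The sketch decodes a clean, perfectly aligned waveform; audio recorded through a microphone would also need bit synchronization and noise handling.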

📦 Dependencies

To run this project, you need to have Python 3 installed, along with the following libraries:

  • numpy
  • sounddevice
  • scipy
  • librosa
  • torch
  • soundfile
  • SpeechRecognition

You can install them using pip:

pip install numpy sounddevice scipy librosa torch soundfile SpeechRecognition

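The sounddevice library relies on PortAudio. If PortAudio is not already present on your system, install it separately (on Debian or Ubuntu, for example, through the libportaudio2 package).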

🔮 Future Improvements

Error Correction: Implement error-detection and error-correction codes (such as parity bits or checksums) to make the transmission more robust against noise.
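
As one possible starting point, the sketch below appends an even-parity bit to every 8 data bits before modulation and checks it after demodulation. The function names and the 9-bit framing are hypothetical and not part of the project. A single parity bit only detects an odd number of flipped bits per frame and cannot correct them.

```python
def add_even_parity(bits):
    """Append one even-parity bit after every 8 data bits.

    Assumes len(bits) is a multiple of 8.
    """
    framed = []
    for i in range(0, len(bits), 8):
        byte = bits[i:i + 8]
        parity = str(byte.count("1") % 2)  # makes the count of 1s in the frame even
        framed.append(byte + parity)
    return "".join(framed)


def check_even_parity(framed):
    """Return the data bits and the indexes of 9-bit frames that fail the check."""
    data, bad_frames = [], []
    for index, i in enumerate(range(0, len(framed), 9)):
        frame = framed[i:i + 9]
        if frame.count("1") % 2 != 0:
            bad_frames.append(index)
        data.append(frame[:8])
    return "".join(data), bad_frames
```

A receiver could use the list of failed frame indexes to request retransmission or to mark the affected characters as unreliable.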

GUI: A simple graphical user interface could be built to make the application more user-friendly.

Improved Demodulation: The current frequency detection is basic. A more robust method using a Fast Fourier Transform (FFT) on each segment would provide more accurate demodulation.
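
A hedged sketch of the spectral approach: instead of a full FFT, it evaluates the discrete Fourier transform of a segment only at the two tone frequencies and picks the stronger one. When a tone falls on a bin center, this matches the corresponding FFT bin (for example from numpy.fft.rfft). The sample rate and function names are assumptions for illustration.

```python
import cmath
import math

SAMPLE_RATE = 44100   # samples per second (assumed)
FREQ_ZERO = 2000.0    # Hz for bit '0'
FREQ_ONE = 4000.0     # Hz for bit '1'


def tone_power(segment, freq):
    """Squared magnitude of the segment's DFT evaluated at a single frequency."""
    step = -2j * math.pi * freq / SAMPLE_RATE
    total = sum(x * cmath.exp(step * n) for n, x in enumerate(segment))
    return abs(total) ** 2


def detect_bit(segment):
    """Return '1' if the 4000 Hz component is stronger than the 2000 Hz one, else '0'."""
    if tone_power(segment, FREQ_ONE) > tone_power(segment, FREQ_ZERO):
        return "1"
    return "0"
```

Because the decision compares energy at the two known frequencies, additive noise spread across the rest of the spectrum affects it less than it affects a zero-crossing count.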

Web Page view

[Screenshots: pixel2, pixel]
