VoxGem

This is a contribution under my summer internship with Build Fast with AI.

Inspiration

We all love a background soundtrack while we work - be it lo-fi music, horror movie explanations, or podcasts. Lately, podcasts have become more than just entertainment - they’re stories, experiences, and raw human emotions shared through voice. This sparked a question: "Can I simulate that podcast magic using AI?" The rest is history.

What it does

It takes any novel, website, PDF, or other readable source and turns a boring reading session into an exciting podcast with the information distilled. In just 3 clicks, a heavy page is shrunk into a 5-minute podcast covering all the major, relevant information.
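As a rough illustration of how the 5-minute target could be passed to the script model, here is a minimal sketch. The speaking rate, the character cap, the "Host" speaker label, and both function names are assumptions made for this example, not values taken from the VoxGem scripts.

```python
# Assumed constants: neither value comes from the VoxGem source.
WORDS_PER_MINUTE = 150     # rough conversational speaking rate (assumption)
MAX_SOURCE_CHARS = 20_000  # hypothetical cap to keep the prompt small


def word_budget(minutes: float, wpm: int = WORDS_PER_MINUTE) -> int:
    """Approximate number of spoken words that fit in `minutes`."""
    return int(minutes * wpm)


def build_script_prompt(source_text: str, minutes: float = 5) -> str:
    """Build a prompt asking for a two-speaker script of about `minutes` minutes."""
    budget = word_budget(minutes)
    excerpt = source_text.strip()[:MAX_SOURCE_CHARS]
    return (
        "Turn the source text below into a podcast conversation between "
        "a Host and a Guest.\n"
        f"Keep it to roughly {budget} words (about {minutes:g} minutes of audio).\n"
        "Cover only the major, relevant points. "
        "Start every line with 'Host:' or 'Guest:'.\n\n"
        f"SOURCE:\n{excerpt}"
    )
```

The returned string can be sent to whichever text model generates the script; a word budget only approximates the final audio length.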

Tech Stack Used:

Firecrawl : For Scraping Data
Gemini-1.5-pro : For Script and text generation
Gemini-2.5-flash-preview-tts : For Multi-model Voice Interaction
Vanilla HTML + CSS : For Frontend
Wavesurfer.js : For Audio/Voice Representation
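For readers curious about the scraping step, the sketch below shows one way a page could be requested from Firecrawl as Markdown. It is not taken from content_scrapping.py: the endpoint path, the payload fields, and the response shape reflect my understanding of Firecrawl's hosted v1 REST API and should be checked against the current Firecrawl documentation. It uses the standard library so it runs without extra packages; the requests package from requiments.txt would work just as well.

```python
import json
import urllib.request

# Assumption: Firecrawl's hosted v1 scrape endpoint; confirm against current docs.
FIRECRAWL_SCRAPE_URL = "https://api.firecrawl.dev/v1/scrape"


def build_scrape_request(page_url: str, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request asking for the page as Markdown."""
    payload = {"url": page_url, "formats": ["markdown"]}
    return urllib.request.Request(
        FIRECRAWL_SCRAPE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scrape_markdown(page_url: str, api_key: str, timeout: float = 60.0) -> str:
    """Send the request and return the scraped Markdown ('' if absent)."""
    request = build_scrape_request(page_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumed response shape: {"success": true, "data": {"markdown": "..."}}
    return body.get("data", {}).get("markdown", "")
```

Keeping the request builder separate from the network call makes the payload easy to inspect without spending API credits.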

All Import files : requiments.txt

pip install -r requiments.txt

requests
markdown
beautifulsoup4
google-generativeai
regex
python-dotenv
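Since python-dotenv is in the list, API keys are presumably read from a local .env file. A minimal sketch of that loading step follows; the variable names FIRECRAWL_API_KEY and GEMINI_API_KEY and the helper load_api_keys are hypothetical, so use whatever names the scripts actually expect.

```python
import os

# Hypothetical variable names; the repository may use different ones.
REQUIRED_KEYS = ("FIRECRAWL_API_KEY", "GEMINI_API_KEY")


def load_api_keys(required=REQUIRED_KEYS) -> dict:
    """Return the required keys from the environment, or raise if any is missing."""
    try:
        # python-dotenv copies entries from a local .env file into os.environ.
        from dotenv import load_dotenv
        load_dotenv()
    except ImportError:
        pass  # fall back to variables already set in the shell

    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: os.environ[name] for name in required}
```

Failing early with the list of missing names is usually easier to debug than an authentication error halfway through the pipeline.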

INSTALLATION:

To run VoxGem locally, follow these steps:

On the GitHub page for this repository, click the "Fork" button.
Clone your forked repository to your computer by typing the following command in the Terminal: git clone https://github.com/<your-github-username>/VoxGem---Podcast-Buddy.git
Navigate into the cloned repository by typing the following command in the Terminal: cd VoxGem---Podcast-Buddy
Select your preferred URL and run "content_scrapping.py" to scrape the website content.
Run "script_generation.py" to generate the script from the content just downloaded.
Run "podcast.py" to generate the .wav file containing the interaction between our voice models.
Open "index.html" with "Show Preview" to hear the podcast in a clean UI.
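The .wav step above can be sketched as follows. This is not the code from podcast.py; it only shows how raw audio bytes from a TTS model could be wrapped in a WAV container with Python's standard wave module. The format constants assume 16-bit mono PCM at 24 kHz, which to my understanding is what Gemini TTS returns, and the helper name save_pcm_as_wav is hypothetical.

```python
import wave

# Assumed audio format: 16-bit mono PCM at 24 kHz. Adjust if the model's
# output differs.
SAMPLE_RATE_HZ = 24_000
SAMPLE_WIDTH_BYTES = 2
CHANNELS = 1


def save_pcm_as_wav(pcm: bytes, path: str) -> float:
    """Wrap raw PCM bytes in a WAV header and return the duration in seconds."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        wav_file.writeframes(pcm)
    frame_count = len(pcm) // (SAMPLE_WIDTH_BYTES * CHANNELS)
    return frame_count / SAMPLE_RATE_HZ
```

Calling save_pcm_as_wav(audio_bytes, "voxgem_podcast.wav") would produce a file that index.html can load; the returned duration is a quick check against the 5-minute target.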

or

Use the Google Colab notebook: https://colab.research.google.com/drive/1kj1-OKoCOBW4SDkDSeG3VQv7RAgE56DB?usp=sharing to run and test the results directly!

Video Tutorial:

https://drive.google.com/file/d/1c4Lzn_AvDzXaMxJUk3mQo1X1stDHMdwM/view?usp=sharing

What's next for VoxGem

Adding multi-language support.
Adding more voice agents to make the discussion broader and more in-depth.
Refining the UI to make it cleaner and more attractive.
Adding an interaction path so real users can ask the "Guest" model any question related to the topic.

