acarter881/google_tts

google_tts

Using Google's Cloud Text-to-Speech API

Table of Contents

FAQ

  1. What is this API?

Google Cloud Text-to-Speech lets developers synthesize natural-sounding speech with 100+ voices, available in multiple languages and variants. It applies DeepMind’s research in WaveNet and Google’s neural networks to produce high-fidelity audio. Because the API is easy to use, you can create lifelike interactions with your users across many applications and devices.

  1. Does the API cost money to use?
  1. What voices can I choose from?
  1. Can I use Speech Synthesis Markup Language (SSML)?
  • Yes. See Google's SSML reference for the supported elements; sample markup follows:
```xml
<speak>
  Here are <say-as interpret-as="characters">SSML</say-as> samples.
  I can pause <break time="3s"/>.
  I can play a sound
  <audio src="https://www.example.com/MY_MP3_FILE.mp3">didn't get your MP3 audio file</audio>.
  I can speak in cardinals. Your number is <say-as interpret-as="cardinal">10</say-as>.
  Or I can speak in ordinals. You are <say-as interpret-as="ordinal">10</say-as> in line.
  Or I can even speak in digits. The digits for ten are <say-as interpret-as="characters">10</say-as>.
  I can also substitute phrases, like the <sub alias="World Wide Web Consortium">W3C</sub>.
  Finally, I can speak a paragraph with two sentences.
  <p><s>This is sentence one.</s><s>This is sentence two.</s></p>
</speak>
```
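
SSML is XML, so any plain text placed inside a `<speak>` document needs its reserved characters (`&`, `<`, `>`) escaped, and the result should parse before it is sent. Here is a minimal sketch using only the Python standard library. The helper names `to_ssml` and `is_well_formed_ssml` are hypothetical and are not part of this repository or of Google's client library:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def to_ssml(text: str, pause: str = "500ms") -> str:
    """Wrap each non-empty line of text in <s>, separated by a <break>."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # escape() converts &, < and > so the text cannot break the markup.
    sentences = [f"<s>{escape(line)}</s>" for line in lines]
    # quoteattr() returns the value already wrapped in quotes.
    separator = f"<break time={quoteattr(pause)}/>"
    return "<speak>" + separator.join(sentences) + "</speak>"


def is_well_formed_ssml(ssml: str) -> bool:
    """Check only that the string is well-formed XML with a <speak> root."""
    try:
        root = ET.fromstring(ssml)
    except ET.ParseError:
        return False
    return root.tag == "speak"
```

This only checks well-formedness. It does not check whether the service supports a given element or attribute.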
  1. Should I use a standard voice or a WaveNet voice?
  • I prefer WaveNet voices. Google's Text-to-Speech documentation describes how they differ from standard voices.
  1. Can I get a custom voice?
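
The FAQ answers on SSML input and on standard versus WaveNet voices come together in a single synthesis request. The sketch below builds the JSON body for the REST `text:synthesize` method and decodes the base64 `audioContent` field of a response, without making any network call. The helper names are hypothetical, the voice name `en-US-Wavenet-D` is only an example, and deriving the language code from the first two parts of the voice name is an assumption that may not hold for every voice. Authentication and the HTTP request itself are left out.

```python
import base64
import json

# REST endpoint for synthesis; shown for reference, not called here.
SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_request(content: str, voice_name: str = "en-US-Wavenet-D",
                  is_ssml: bool = False, encoding: str = "MP3") -> dict:
    """Build the JSON body for a synthesis request (hypothetical helper)."""
    # Assumption: the voice name starts with the language code,
    # e.g. "en-US" in "en-US-Wavenet-D" or "en-US-Standard-B".
    language_code = "-".join(voice_name.split("-")[:2])
    return {
        "input": {"ssml" if is_ssml else "text": content},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {"audioEncoding": encoding},
    }


def audio_bytes(response_json: str) -> bytes:
    """Decode the base64 audio carried in a response's audioContent field."""
    return base64.b64decode(json.loads(response_json)["audioContent"])
```

Switching between a standard and a WaveNet voice is then a matter of passing a different `voice_name`, and the bytes returned by `audio_bytes` can be written straight to an `.mp3` file.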

Sample Output

More Information
