Avoid storing empty files and show an error in Phonos
Closed, Resolved · Public · 5 Estimated Story Points

Assigned To

Authored By

	MusikAnimal
	Dec 1 2022, 7:25 PM

Description

We've found that Phonos engines sometimes return empty audio files, presumably because the IPA passed to them was invalid or otherwise could not be interpreted. While some of that is expected, there's no reason to leave these empty files lingering in Swift forever. Phonos should avoid storing such a file and instead show a user-facing error that audio could not be generated.

Acceptance criteria

Don't store files that are very small (current threshold is 1200 bytes)
Show an error to the user, something like "The generated audio appears to be empty. The given IPA may be invalid, or the engine can't interpret it. Using the '$1' parameter may help."
- NOTE: It's not really safe to say the given parameters are definitively invalid; instead we just want to hint that it could be fixed by editorial trial-and-error.

Details

	Subject	Repo	Branch	Lines +/-
	Add a minimum file size and show an error if a file less than it	mediawiki/extensions/Phonos	master	+33 -5


Related Objects

Mentioned In: T322787: IPA can be optional only if a wikidata item or audio file is provided
rEPHN379653ad8505: GoogleEngine: don't remove parentheses from IPA input
rEPHN3b95fe4e208a: Add a minimum file size and show an error if a file less than it
Mentioned Here: T319379: Add option to not convert to MP3 via LAME
P42430 All(?) IPA phonemes google recognises in individual phonos tags

Event Timeline

MusikAnimal created this task. Dec 1 2022, 7:25 PM

Restricted Application added a subscriber: Aklapper. Dec 1 2022, 7:25 PM

MusikAnimal claimed this task. Dec 1 2022, 8:43 PM

MusikAnimal edited projects, added Community-Tech (CommTech-Sprint-37); removed Community-Tech.

MusikAnimal moved this task from Ready 🎬 to In Development 💻 on the Community-Tech (CommTech-Sprint-37) board.

MusikAnimal renamed this task from "Automatically delete 0 byte files and show an error in Phonos" to "Automatically delete empty files and show an error in Phonos". Dec 1 2022, 9:48 PM

Change 863051 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/Phonos@master] Add a minimum file size and show an error if a file less than it

https://gerrit.wikimedia.org/r/863051

gerritbot added a project: Patch-For-Review. Dec 1 2022, 11:02 PM

See what others think of my solution. I got to coding and realized that we can probably safely deduce that audio is empty (i.e. created but < 0:01 in length) solely from the byte length of the raw MP3 data, without having to create the file first. This avoids unnecessary operations in Swift. An alternative is to do what Extension:Score does and use a script to get the length of the audio and go by that, but this would require a trip through Shellbox, which we want to avoid if possible.

I tested many short words like "a", "hi", etc., and all are well over the minimum size I went with of 1200 bytes. For now, I've only done this for the Google engine since I know others (eSpeak in particular) generate shorter audio files.
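For illustration, the size check described above could be sketched as follows. This is a Python sketch, not the code from the Gerrit change: the names `MIN_FILE_SIZE`, `EmptyAudioError` and `check_audio_size` are hypothetical, the engine keys are illustrative, and the strict "less than" comparison is an assumption based on the patch title. Only the 1200-byte value for the Google engine comes from this task.

```python
from typing import Dict, Optional

# Minimum size in bytes per engine; None means no minimum is enforced.
# Only the 1200-byte Google value comes from this task. The engine keys
# and the absence of a limit for eSpeak are illustrative assumptions.
MIN_FILE_SIZE: Dict[str, Optional[int]] = {
    "google": 1200,
    "espeak": None,
}


class EmptyAudioError(Exception):
    """Raised when engine output is too small to plausibly contain speech."""


def check_audio_size(engine: str, mp3_data: bytes) -> bytes:
    """Return mp3_data unchanged, or raise before anything is stored.

    The check uses only the byte length of the raw data, so no file has
    to be written and no external script has to be run.
    """
    minimum = MIN_FILE_SIZE.get(engine)
    # Assumption: "smaller than the minimum" means strictly less than.
    if minimum is not None and len(mp3_data) < minimum:
        raise EmptyAudioError(
            "The generated audio appears to be empty. The given IPA may be "
            "invalid, or the engine can't interpret it."
        )
    return mp3_data
```

A caller would run `check_audio_size()` on the engine response and only persist the bytes if no exception is raised, so an empty result never reaches storage.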

MusikAnimal renamed this task from "Automatically delete empty files and show an error in Phonos" to "Avoid storing empty files and show an error in Phonos". Dec 1 2022, 11:24 PM

MusikAnimal updated the task description.

For comparison, the first one here is empty and 1,056 bytes long.

JMcLeod_WMF moved this task from Backlog to 🌟Top Priority on the MediaWiki-extensions-Phonos board. Dec 2 2022, 2:56 PM

Change 863051 merged by jenkins-bot:

[mediawiki/extensions/Phonos@master] Add a minimum file size and show an error if a file less than it

https://gerrit.wikimedia.org/r/863051

MusikAnimal mentioned this in rEPHN3b95fe4e208a: Add a minimum file size and show an error if a file less than it. Dec 2 2022, 3:50 PM

ReleaseTaggerBot added a project: MW-1.40-notes (1.40.0-wmf.13; 2022-12-05). Dec 2 2022, 4:00 PM

Maintenance_bot removed a project: Patch-For-Review. Dec 2 2022, 4:30 PM

QA notes: Examples of IPA that generate empty audio:

<phonos lang="en" ipa="ˈkɑːtɑːr, kəˈtɑːr" />
<phonos lang="ar" ipa="Ķuḍā'ī" />
<phonos lang="ar" ipa="foobar" />

Hopefully it's easy to create more examples. Basically, just refrain from using the text parameter and give it bogus IPA; it will often fail. (Note that the first two examples above are not actually bogus, though!)

Another thing to be aware of is that it's possible (but as yet unproven) that actual playable files are 1200 bytes or smaller. In my testing, I looked at a lot of single-syllable words such as "a", "hi", etc., and none ever seemed to be close to 1200 bytes. So QA'ing might also involve trying to find would-be legitimate Phonos files that never get stored because they're so small.

MusikAnimal set the point value for this task to 5. Dec 6 2022, 12:16 AM

MusikAnimal mentioned this in rEPHN379653ad8505: GoogleEngine: don't remove parentheses from IPA input. Dec 6 2022, 1:28 PM

JMcLeod_WMF edited projects, added Community-Tech (CommTech-Sprint-38); removed Community-Tech (CommTech-Sprint-37). Dec 6 2022, 1:38 PM

JMcLeod_WMF moved this task from Ready 🎬 to QA 🐛 on the Community-Tech (CommTech-Sprint-38) board. Dec 6 2022, 1:39 PM

In T324239#8443393, @MusikAnimal wrote:

Another thing to be aware of is that it's possible (but as yet unproven) that actual playable files are 1200 bytes or smaller. In my testing, I looked at a lot of single-syllable words such as "a", "hi", etc., and none ever seemed to be close to 1200 bytes. So QA'ing might also involve trying to find would-be legitimate Phonos files that never get stored because they're so small.

The smallest playable file I have found so far is 1440 bytes. Here is an example: <phonos ipa="ɑ̃" lang=fr-ca />.

The largest "empty" file I have found so far is 1152 bytes (testing on the commit before this patch).
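As a quick sanity check on these numbers, the sketch below (Python, with hypothetical variable names) confirms that the 1200-byte threshold sits between the empty and playable sizes reported in this task, and how much margin there is on each side. It assumes files strictly smaller than the threshold are rejected, and it only covers the three sizes mentioned here, so it does not prove the threshold is safe in general.

```python
# Sizes in bytes reported in this task for the Google engine.
MIN_FILE_SIZE = 1200
empty_sizes = [1056, 1152]   # files observed to be empty
playable_sizes = [1440]      # smallest playable file found so far

# Assumption: a file is rejected when its size is strictly below the threshold.
all_empty_rejected = all(size < MIN_FILE_SIZE for size in empty_sizes)
all_playable_kept = all(size >= MIN_FILE_SIZE for size in playable_sizes)

margin_below = MIN_FILE_SIZE - max(empty_sizes)     # 48 bytes
margin_above = min(playable_sizes) - MIN_FILE_SIZE  # 240 bytes

print(all_empty_rejected, all_playable_kept, margin_below, margin_above)
```

The margin on the empty side is the tighter of the two, which is worth keeping in mind if a larger empty file turns up later.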

I extracted all(?) of the IPA phonemes from https://cloud.google.com/text-to-speech/docs/phonemes and created a phonos tag for each (P42430).
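A minimal sketch of how such a list of tags could be generated is shown below. It is not the script used for P42430; the helper name `phonos_tag` is hypothetical, and the sample inputs are IPA strings quoted elsewhere in this task rather than the full phoneme list, which is not reproduced here.

```python
from html import escape
from typing import List, Tuple


def phonos_tag(ipa: str, lang: str = "en") -> str:
    """Build a <phonos> tag, escaping attribute values for safety."""
    return '<phonos lang="{}" ipa="{}" />'.format(
        escape(lang, quote=True), escape(ipa, quote=True)
    )


# Sample (lang, ipa) pairs taken from this task; a real run would use
# the phonemes extracted from the Google documentation instead.
samples: List[Tuple[str, str]] = [
    ("fr-ca", "ɑ̃"),
    ("ar", "foobar"),
    ("en", "ˈkɑːtɑːr, kəˈtɑːr"),
]

tags = [phonos_tag(ipa, lang) for lang, ipa in samples]
print("\n".join(tags))
```

Pasting the printed tags into a test page gives one audio player per phoneme, which makes it easier to spot inputs that produce empty or suspiciously small files.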

@MusikAnimal Oh, just to clarify, we always get an mp3 from google? Per T319379. I am assuming that if we did get a WAV from google it would be bigger (even if empty?)

dom_walden mentioned this in T322787: IPA can be optional only if a wikidata item or audio file is provided. Dec 8 2022, 10:35 AM

In T324239#8450981, @dom_walden wrote:

@MusikAnimal Oh, just to clarify, we always get an mp3 from google? Per T319379. I am assuming that if we did get a WAV from google it would be bigger (even if empty?)

Correct, we always get MP3 from Google. For now, there's no minimum on the size of the files for the other engines, since I didn't bother to figure out what the appropriate value should be. eSpeak for instance makes very short audio files, so the threshold would need to be lower for it.

Restricted Application edited projects, added Community-Tech; removed Community-Tech (CommTech-Sprint-38). Dec 9 2022, 10:19 PM

MusikAnimal edited projects, added Community-Tech (CommTech-Sprint-38); removed Community-Tech. Dec 9 2022, 10:22 PM

dom_walden moved this task from QA 🐛 to Product sign-off 🤘 on the Community-Tech (CommTech-Sprint-38) board. Dec 20 2022, 1:16 PM

JMcLeod_WMF edited projects, added Community-Tech (CommTech-Sprint-39); removed Community-Tech (CommTech-Sprint-38). Jan 3 2023, 6:37 PM

JMcLeod_WMF moved this task from Ready 🎬 to Product sign-off 🤘 on the Community-Tech (CommTech-Sprint-39) board. Jan 3 2023, 6:38 PM

NRodriguez closed this task as Resolved. Jan 4 2023, 9:47 PM

JMcLeod_WMF moved this task from Product sign-off 🤘 to Done 🏁 on the Community-Tech (CommTech-Sprint-39) board. Jan 16 2023, 2:43 PM
