Define TTS enhancements in a working group note #1700

mattgarrish · 2021-06-11T12:35:33Z

This PR fixes #1690 but doesn't change anything from a validation perspective. It is still valid to link to PLS lexicons, and fallbacks are not required for linked resources. All it does is stop promoting the practice given the lack of real-world support.

The reading system support requirements are unchanged except for a couple of references to the core specification that now link to the PLS specification and the RS processing section, as appropriate. It might be worth taking PLS out of this document, too, as if we ever do get support it would seemingly have to come through browsers.

~~If authoring support ever develops, this is probably something we could better take up in the accessibility techniques (same for CSS 3 Speech).~~

Due to changes in direction resulting from further discussion of #1690, this pull request now adds a new note that covers TTS enhancements in EPUB (SSML+PLS+CSS Speech). These technologies are still valid to use in EPUB publications.

Please refer to #1700 (comment) and onwards in this thread for the current discussion.

Fixes #1690
Fixes #1712


iherman

Is it o.k. to leave the PLS reference in the example in §2.3.2.4.2? It may be legal, but now it appears out of the blue...

(As an editorial aside: that example does not look like an example... I guess it should be an <aside> rather than a <div>)

mattgarrish · 2021-06-11T13:59:14Z

> I guess it should be an <aside> rather than a <div>

Hm, I hadn't noticed that respec drops the formal numbered heading when a div is used. I fixed this one, but there are a lot of others. Looks like we should only use div when the example is integrated with the surrounding text.

iherman · 2021-06-11T14:04:44Z

> > I guess it should be an <aside> rather than a <div>
>
> Hm, I hadn't noticed that respec drops the formal numbered heading when a div is used. I fixed this one, but there are a lot of others. Looks like we should only use div when the example is integrated with the surrounding text.

Maybe let us leave this as is for this PR, and come back to that later...

mattgarrish · 2021-06-11T14:32:48Z

> Maybe let us leave this as is for this PR, and come back to that later...

Ya, I'm more wondering if we should dump the RS processing requirements. I'm sure if anyone tried to implement them they'd find them vastly underspecified.

I don't even know what it means for a reading system to "apply the supplied pronunciation instructions" to the text nodes. The problem is more complicated than that, as it assumes the reading system is doing the voicing. It's more likely that a built-in OS voicing technology or an AT accessing the DOM is going to send the text to a TTS engine to be rendered, and it would have to inject any author-defined phonemes at that stage.

So the few rules that we have don't even target the right application. The voicing application would only gain knowledge of a PLS lexicon from the link.
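To make the "knowledge from the link" point concrete, here is a minimal sketch of how a voicing application might discover lexicons from an XHTML content document. It assumes the `rel="pronunciation"` and `type="application/pls+xml"` convention that the EPUB 3 specification documented for `link` elements; the sample document, file names, and the function name `find_lexicon_links` are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical content document; the href values are made up for illustration.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Chapter 1</title>
    <link rel="stylesheet" type="text/css" href="style.css"/>
    <link rel="pronunciation" type="application/pls+xml"
          hreflang="en" href="lexicon-en.pls"/>
  </head>
  <body><p>Text</p></body>
</html>"""


def find_lexicon_links(xhtml):
    """Return (href, hreflang) pairs for link elements that point at PLS lexicons."""
    root = ET.fromstring(xhtml)
    found = []
    for link in root.iter(f"{{{XHTML_NS}}}link"):
        # rel is a space-separated token list, so check membership, not equality.
        rels = link.get("rel", "").lower().split()
        if "pronunciation" in rels and link.get("type") == "application/pls+xml":
            found.append((link.get("href"), link.get("hreflang")))
    return found


print(find_lexicon_links(SAMPLE))
```

Nothing in this sketch voices any text; it only shows that the link is the sole hook available to whichever application ends up driving the TTS engine.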

iherman · 2021-06-11T14:51:13Z

Good point. There is no reason to have anything about PLS in the Reading system if the content document does not even mention it...

iherman · 2021-06-11T16:13:40Z

I wonder whether it is worth adding this (informative) reference to the document:

https://www.w3.org/TR/spoken-html/

It may be at a very early stage right now, but may be in much better shape by the time we get to Rec...

mattgarrish · 2021-06-11T17:50:29Z

> I wonder whether it is worth adding this (informative) reference to the document:

The formulation of SSML created in IDPF is similarly fraught, but is probably a separate issue.

That document skims past the same lack of standardization on how to include PLS in html and assumes it can be done. That's where our definition of using link elements might better belong.

The use of data-* attributes at this stage is also problematic. Until they move to a more viable proposal, we may want to hold off on citing. It could be misinterpreted as an actual extension.

murata2makoto · 2021-06-11T23:33:42Z

Let me check. I would not be surprised if PLS is heavily used by at least one textbook publisher in Japan. I know that it heavily uses SSML.

mattgarrish · 2021-06-12T00:25:06Z

> I would not be surprised if PLS is heavily used by at least one textbook publisher in Japan.

The change is not that different from CFIs in content. You can still use those, too, even if the spec doesn't actively promote them, and reading systems can support them in content if they want. Similarly, we don't invalidate anything anyone has done, or will continue to do, as we were basically just documenting how to use the link tag for one type of resource. Reading systems don't require the spec to say anything to choose to support lexicons, either.

The ssml attributes, on the other hand, are solely defined by our specification, so they're here to stay for the duration in some form. It's premature to look at the WAI group's work yet, but if it gains better traction than our own attributes we'll need to address migration eventually. It looks like it may end up with very similar, but likely unprefixed, attributes, which would help in that.

I don't see it's something we'll be able to address in this revision, though. After getting burned by aria-describedat, I'm hesitant about talking up any new technologies in the specification that aren't far along the REC track, even if it's only in a note.
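For readers unfamiliar with the EPUB-defined attributes mentioned above, the following is a small sketch of how authored pronunciations could be read out of a content document. It assumes the `ssml:ph` and `ssml:alphabet` attributes in the SSML namespace; the sample markup, the phonetic string, and the function name `authored_phonemes` are illustrative only.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"

# Hypothetical content document; the x-sampa string is a rough illustration.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ssml="http://www.w3.org/2001/10/synthesis">
  <body>
    <p>She will <span ssml:alphabet="x-sampa" ssml:ph="ri:d">read</span> it.</p>
  </body>
</html>"""


def authored_phonemes(xhtml):
    """Return (text, alphabet, phonemes) for each element carrying ssml:ph."""
    root = ET.fromstring(xhtml)
    results = []
    for element in root.iter():
        phonemes = element.get(f"{{{SSML_NS}}}ph")
        if phonemes is not None:
            text = "".join(element.itertext())
            alphabet = element.get(f"{{{SSML_NS}}}alphabet")
            results.append((text, alphabet, phonemes))
    return results


print(authored_phonemes(SAMPLE))
```

If unprefixed attributes with similar semantics were ever adopted, a migration could in principle amount to changing the attribute lookup, which is the sense in which similar attributes "would help".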

murata2makoto · 2021-06-12T03:56:01Z

@mattgarrish

PLS is not broken and is widely used. SSML is used even more. Why do we have to touch SSML and PLS, when they are not broken and there are no mature alternatives?

mattgarrish · 2021-06-12T12:13:00Z

> PLS is not broken and is widely used.

Do you have any evidence that PLS is widely used and supported in EPUB?

At any rate, CSS 3 Speech is also not mentioned in the specification as there's nothing the specification needs to add to make it valid. We don't need to document everything you can do.

EPUB is not where these technologies should have been defined for HTML, either. As I said at the last meeting, if we're serious about getting traction for PLS, we should look at getting the section incorporated into the spoken presentation specification or publishing it as a separate note (perhaps in the CG, like Alternate Style Sheets). It shouldn't be buried in the EPUB spec.

I also think this is better documented in the accessibility techniques document, because, as you say, this is more best practice guidance for making content accessible (i.e., meeting WCAG 3.1.6).

okayama247 · 2021-06-15T13:59:19Z

> Do you have any evidence that PLS is widely used and supported in EPUB?

Yes, Japanese textbook companies use SSML as a pronunciation processing method for digital textbooks, and it seems that PLS is used as a dictionary function. I'm currently checking with textbook companies.
In addition, the Ministry of Internal Affairs and Communications of Japan has established guidelines for producing e-books that support accessibility through read-aloud. These guidelines stipulate how to use PLS.

Here are some examples of sites where PLS is mentioned (Japanese only):
* https://www.lentrance.com/features/
* https://www.itrc.net/contents/wg/uat/2016-5.pdf
* https://www.soumu.go.jp/main_content/000354698.pdf

mattgarrish · 2021-06-18T20:51:55Z

I've created the TTS note now. You can preview it at: https://cdn.statically.io/gh/w3c/epub-specs/editorial/issue-1690/epub33/tts/index.html

I've added some introductory text around what we had, but I've otherwise kept the requirements as they were. Fixes for the other issues I've opened can be done after this gets merged.

The pull request now also modifies the reading systems specification: preview, diff

murata2makoto · 2021-06-18T21:03:57Z

@mattgarrish

Thanks. Should I make comments on the new note here? Or, should I create a separate issue?

mattgarrish · 2021-06-18T21:28:14Z

If it's about how we fix/improve the authoring/rs requirements, I'd say open new issues for those so we can tackle them separately and it'll be clearer what is changing. That's what I've been trying to do.

If it's just additions/clarifications you'd like to see in any of the intro text I've added to fill out the document as a more complete note, you can probably add them here.

okayama247 · 2021-06-19T03:07:18Z

Matt-san, thank you for the comments and suggestions.

I have confirmed PLS with some Japanese textbook companies that appear to be using SSML.

Strictly speaking, textbook companies do not use PLS, but they refer to PLS notation and create a TTS dictionary for the reading system to control SSML.
On the other hand, according to Murata-san's information, it seems that a leading EPUB company is adopting PLS.

Digital textbooks must be produced so that nothing is misread aloud, so textbook companies apply SSML to the full text.
Digital textbooks may also use SMIL to synchronize sentences with pronunciation.

Japanese sentences mix kana and kanji (and sometimes foreign words), so the reading (pronunciation) of a character changes depending on the structure of the sentence. In Japanese, the most important factor is pronouncing the characters correctly.

Textbooks currently use full-text SSML, but if TTS technology improves in the future, partial SSML should be enough to get correct pronunciation. Therefore, I think it is better to keep the PLS notation going forward.

murata2makoto · 2021-06-19T03:30:01Z

@okayama247

We are heading for a separate note for PLS+SSML+CSS Speech. Of course, there are pros and cons to this decision. But it is now easier to make some improvements. If there is any low-hanging fruit, please submit a proposal.

okayama247 · 2021-06-19T07:25:07Z

Murata-san, I got it.

mattgarrish · 2021-06-19T11:13:17Z

> but they refer to PLS notation and create a TTS dictionary for the reading system to control SSML.

Yes, I think we can improve our requirements to make it more obvious that this makes for a conforming implementation.

We should only require the correct phonemes be applied independent of how it is done. We could then informatively suggest some known ways, like initializing the TTS engine with the lexicons (if it supports PLS), compiling the lexemes and applying the phonetic spellings to the text passed to the TTS engine, or transforming the PLS file into a format the TTS engine can recognize.

But we should take these issues up separately from creating the initial note. It'll be more helpful for a change log to have separate issues and pull requests we can refer to.
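One of the approaches suggested above, compiling the lexemes and applying the phonetic spellings to text before it reaches the TTS engine, could look roughly like the sketch below. It assumes the engine accepts SSML `phoneme` elements; the sample lexicon, its x-sampa string, and the function names `compile_lexemes` and `apply_lexemes` are hypothetical. Matching is naive substring matching, which a real implementation would need to refine per language.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"

# Hypothetical lexicon; the phonetic spelling is a rough illustration.
SAMPLE_PLS = """<lexicon version="1.0" alphabet="x-sampa" xml:lang="en"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon">
  <lexeme>
    <grapheme>EPUB</grapheme>
    <phoneme>i:pVb</phoneme>
  </lexeme>
</lexicon>"""


def compile_lexemes(pls_xml):
    """Return (alphabet, {grapheme: phoneme}) using the first phoneme of each lexeme."""
    root = ET.fromstring(pls_xml)
    alphabet = root.get("alphabet", "")
    table = {}
    for lexeme in root.iter(f"{{{PLS_NS}}}lexeme"):
        phoneme = lexeme.find(f"{{{PLS_NS}}}phoneme")
        if phoneme is None or not phoneme.text:
            continue
        for grapheme in lexeme.iter(f"{{{PLS_NS}}}grapheme"):
            if grapheme.text:
                # Keep the first entry if a grapheme appears more than once.
                table.setdefault(grapheme.text.strip(), phoneme.text.strip())
    return alphabet, table


def apply_lexemes(text, alphabet, table):
    """Wrap each known grapheme in an SSML phoneme element; escape everything else."""
    if not table:
        return escape(text)
    # Longest graphemes first so longer entries win over their own prefixes.
    ordered = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(g) for g in ordered))
    parts, position = [], 0
    for match in pattern.finditer(text):
        parts.append(escape(text[position:match.start()]))
        parts.append(
            f"<phoneme alphabet={quoteattr(alphabet)} "
            f"ph={quoteattr(table[match.group()])}>{escape(match.group())}</phoneme>"
        )
        position = match.end()
    parts.append(escape(text[position:]))
    return "".join(parts)


alphabet, table = compile_lexemes(SAMPLE_PLS)
print(apply_lexemes("Open the EPUB file.", alphabet, table))
```

Because the requirement would only be that the correct phonemes are applied, initializing an engine with the lexicon directly or converting the PLS file to an engine-specific format would be equally valid alternatives to this approach.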

mattgarrish · 2021-06-21T14:47:51Z

If there are no other editorial issues, I'll merge this by end of day tomorrow so we can move to fixing up the requirements.

iherman · 2021-06-22T12:02:21Z

@mattgarrish just reading through the TTS draft: the text speaks about XHTML only, although we found out that everything is reproducible in SVG, too. It is a note, and RS do not seem to implement TTS in general, so it does no harm if we add SVG alongside HTML.

mattgarrish · 2021-06-22T12:40:09Z

> It is a note, and RS do not seem to implement TTS in general, so it does no harm if we add SVG alongside HTML.

Right, I just want to make any substantive changes after we break out the tts note so that we're not changing too much at once. Also so we can directly tie a change log to specific pull requests.

Given there's been no other feedback, though, I'm going to merge this and then make the svg changes for #1710.

Define TTS enhancements in a working group note

mattgarrish added 2 commits June 11, 2021 09:10

remove pls lexicon section and remove entry from core media types table

2dbf73f

change references to core pls section to the pls specification and the rs processing section

f34029f

mattgarrish requested review from avneeshsingh, dauwhe, iherman, shiestyle and wareid June 11, 2021 12:35

iherman approved these changes Jun 11, 2021

View reviewed changes

mattgarrish added 2 commits June 11, 2021 10:53

remove pls file from manifest cmt example

cd28a2b

s/div/aside/

9b9c3f7

remove pls lexicon support requirements for reading systems

e7fb902

iherman mentioned this pull request Jun 17, 2021

Remove PLS section from authoring specification #1690

Closed

mattgarrish changed the title ~~Remove PLS lexicon section~~ Define TTS enhancements in a working group note Jun 17, 2021

Merge branch 'main' into editorial/issue-1690

12564f1

iherman mentioned this pull request Jun 18, 2021

Regrouped CSS and Scripting subsections in section 3 #1709

Merged

mattgarrish added 3 commits June 18, 2021 17:10

move ssml and css speech requirements to the new tts note

615b4a3

Merge branch 'editorial/issue-1690' of https://github.com/w3c/epub-specs into editorial/issue-1690 # Conflicts: # epub33/core/index.html

5afac77

Merge branch 'main' into editorial/issue-1690

bbee8b9

mattgarrish added 2 commits June 18, 2021 17:57

fix bad id plus minor typos

7ee8b72

Merge branch 'editorial/issue-1690' of https://github.com/w3c/epub-specs into editorial/issue-1690

31f3b3d

mattgarrish added 2 commits June 19, 2021 08:47

update intro to clarify pls and ssml can be used independently and improve the background section

cc1c614

minor tweak to pls intro to focus less on ssml

2fdc027

mattgarrish added 3 commits June 21, 2021 14:53

add examples

5620f74

fix status to ed

9cca3df

updated examples

f1ddbfb

mattgarrish merged commit ac29474 into main Jun 22, 2021

mattgarrish deleted the editorial/issue-1690 branch June 22, 2021 12:40

iherman pushed a commit that referenced this pull request Jun 23, 2021

Merge pull request #1700 from w3c/editorial/issue-1690

dee5fc6

Define TTS enhancements in a working group note
