[css-speech-1] pause-before should not apply in most contexts #4870

@cookiecrook

The original PFWG feedback on CSS 3 Speech from 2011 included this comment about pause-before:

We are also concerned that end users will interpret correct implementation of these properties as a severe performance lag. For example, if a user were forced to wait 2 seconds between each heading, the experience would be tedious for TTS users comfortable with machine speech at rates pushing 400 words per minute.

But the CSS WG rejected that comment from the W3C's cross-functional accessibility review group, listing a bulk acceptance (by @michael-n-cooper) of the rejections. However, as I read the resolution, the acceptance covered only the decision not to remove the properties, on the condition that the following guidance, among other notes, be added.

If you plan to keep this property, we suggest the following:
[...snip...]
Unequivocally declare that implementors should ignore pause-before values when navigating to an element in the screen reader context, so as to not create the perception of performance lag. e.g., If a screen reader user presses the command to "jump to next heading," speak it immediately. Ignore pause-before immediately after a focus change.

But those notes were never added prior to publishing CSS 3 Speech.

That appears to have been an oversight or miscommunication, so I'm re-raising this as a blocking issue for the republish of CSS 3 Speech to CSS Speech 1, with the additional context below.

pause-before should not apply at all in certain circumstances, depending on how the user got to the element. For example, if a screen reader user performs the keypress for “next heading”, they should hear the speech immediately without delay. Trimming leading silence is somewhat analogous to trimming leading whitespace.

Some screen reader users notice and start to be annoyed if a time-to-utterance delay (leading silence) is greater than 40ms. Most daily screen reader users would notice the delay at about 80–100ms. So allowing page authors to specify delays of several seconds does not make sense in the context where the screen reader user or speak-on-hover user is actively navigating.
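To make the request concrete, here is a minimal sketch of how an implementation might decide whether to honor an author's pause-before value. The `SpeechContext` type and the `effectivePauseBefore` function are hypothetical names chosen for illustration; neither CSS Speech nor any screen reader defines them, and the three contexts are an assumption drawn from the cases described in this issue.

```typescript
// Hypothetical context names; not defined by CSS Speech or any screen reader API.
type SpeechContext = "user-navigation" | "speak-on-hover" | "continuous-reading";

// Returns the leading silence, in milliseconds, to render before an element is spoken.
// Assumption: the author's pause-before has already been resolved to milliseconds.
function effectivePauseBefore(authorPauseMs: number, context: SpeechContext): number {
  if (context === "continuous-reading") {
    // Linear rendering (ebook, "read all"): honor the author's value, never negative.
    return Math.max(0, authorPauseMs);
  }
  // The user just issued a command (e.g. "next heading") or hovered: speak immediately.
  return 0;
}
```

Under this sketch, an author value of 2 seconds yields no delay after a "next heading" command, but is still rendered when the same heading is reached during continuous reading.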

There are some circumstances where gaps between concatenated utterances in a single rendering are appropriate (e.g. pauses between phrases in an ebook or “read all” context), but because the spec is focused on linear generated audio rather than speech usage in general, it doesn’t adequately represent the contexts where features like pause-before should not apply.
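The "read all" case could be sketched as follows: pauses between concatenated utterances are kept, while the leading silence of the first utterance is trimmed because it directly follows the user's command. The `Utterance` and `ScheduledUtterance` shapes and the `scheduleReadAll` function are hypothetical, and trimming only the first pause is one possible reading of the whitespace analogy above, not something the spec states.

```typescript
// Hypothetical shapes for illustration; not part of CSS Speech.
interface Utterance {
  text: string;
  pauseBeforeMs: number; // author's pause-before, assumed already resolved to ms
}

interface ScheduledUtterance {
  text: string;
  leadingSilenceMs: number;
}

// Builds a linear rendering plan for a "read all" command.
function scheduleReadAll(utterances: Utterance[]): ScheduledUtterance[] {
  return utterances.map((u, index) => ({
    text: u.text,
    // The first utterance follows the user's command, so its leading silence is
    // trimmed (like leading whitespace); later pauses are kept as authored.
    leadingSilenceMs: index === 0 ? 0 : Math.max(0, u.pauseBeforeMs),
  }));
}
```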


Labels: a11y-needs-resolution (Issue the Accessibility Group has raised and looks for a response on.), css-speech-1
