Reassign shortcuts to multi-layer gestures (a single keyboard shortcut can perform different functions based on the number of times it is pressed) #16900

tarikhadzirovicofficial · 2024-07-20T17:38:01Z

tarikhadzirovicofficial
Jul 20, 2024

This will be a somewhat long text, so I will try to describe everything as best as I can.

Is your feature request related to a problem? Please describe.

The feature I would like to see in the NVDA screen reader, which I will describe today, is not directly related to a problem.
I got the idea while working on the Croatian translation for NVDA and correcting some errors. In a list of 3000 strings, I needed to find the exact string that describes the NVDA+2 shortcut in the input help and input gestures. Since I have the Typing Settings add-on, it has its own string for that, so I spent some time searching for the exact string. Therefore, I commented on #10302, hoping that someone would see that comment. So, the idea for this feature actually came from this problem.

Describe the solution you'd like

There are many add-ons that greatly facilitate working with the NVDA screen reader. One of them is Helper Scripts, which, among other things, offers an enhanced script for viewing and managing the Windows clipboard. With a single press of the NVDA+C shortcut, NVDA announces the clipboard text; with a double press, it reads the text character by character; and with a triple press, a window opens in which the clipboard text can be viewed and edited.
It would be great if a feature could be implemented in NVDA itself that would allow a single shortcut to perform a different, user-assigned command depending on how many times it is pressed. We also covered this topic on the NVDA mailing list.
Here is an example of how this could look. Suppose we want to assign the sound split mode change feature to the NVDA+S shortcut on a double press and the sound ducking change feature on a triple press.
Open Input Gestures, find the "Speech" category, and then the command "Cycles between speech modes". Expanding that command shows that it has the NVDA+S shortcut assigned to it. Pressing Tab once moves to the "Remove" button; if this feature were implemented, there would be a "Create Multi-Layer Gesture" button before the "Remove" button. Activating this button would open a dialog box titled "Creating Multi-Layer Gesture". Meanwhile, nothing would change about what NVDA+S does when pressed once.
The first control in the dialog should be a combobox for choosing whether the new assignment applies to a double, triple, or quadruple press. After that, there should be a tree view of all commands that can be activated with a keyboard shortcut (the same list shown when opening Input Gestures). Selecting a command should work the same way as, for example, selecting files, so it would be enough to focus the command that should run on the press count chosen in the preceding combobox.
Pressing Tab again should move to a second combobox, this time offering only the two press counts that remain. So, if we set the first combobox to a double press and selected the "Cycles through sound split modes" command in the list, the second combobox and the command list after it should exclude those two items, since they are already in use. The same applies to the third combobox, except that only one press count remains and the list of available commands is shorter by two. Of course, filling in all three fields shouldn't be mandatory, so if we don't want a quadruple press, we can simply select nothing in the list for that combobox.
Now, what Luke Davis explained on the mailing list might present a potential problem; read the full discussion for details. It has to be decided whether a gesture is time-sensitive, meaning whether it needs to be executed immediately or not. For commands that announce text or other information that cannot affect the screen reader's work (such as the current time and date, battery status, window title, etc.), this is not a problem, because that information is simply ignored upon the second key press. If this imaginary gesture behaved the same way, a double press would first change the speech mode to "off" (if, for example, only two speech modes are available, as in my case), and then change the sound split mode. Of course, this is not the desired behavior.
Therefore, the user should be able to decide how long the script waits: for example, 500 milliseconds for commands that only announce text, which is negligible, or one second (or any other duration the user chooses) for more important features, those that affect the screen reader's work. Thus, if the wait is set to one second, a single press of NVDA+S would turn off speech only after that one second has passed. If we pressed it twice quickly, the script should also wait one second, but then change the sound split mode instead, since NVDA+S was pressed twice. I believe this waiting cannot be avoided and is an integral part of such shortcuts. It must be present wherever shortcuts directly affect the screen reader's operation.
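To make the waiting behavior more concrete, here is a minimal Python sketch of how such a dispatcher might work. It is not NVDA code: the class `DeferredGestureDispatcher`, its methods, and the binding table are hypothetical names made up for illustration, and the sketch assumes the wait is measured from the most recent press.

```python
from typing import Callable, Dict, Optional, Tuple


class DeferredGestureDispatcher:
    """Hypothetical dispatcher that waits for the timeout before choosing an action."""

    def __init__(self, bindings: Dict[Tuple[str, int], Callable[[], None]],
                 timeout: float = 0.5) -> None:
        self.bindings = bindings  # (gesture, press count) -> action
        self.timeout = timeout    # user-configurable wait, in seconds
        self.pending: Optional[str] = None
        self.presses = 0
        self.last_press = 0.0

    def press(self, gesture: str, now: float) -> None:
        """Record a key press; nothing runs yet unless an earlier wait has ended."""
        self.tick(now)  # run anything whose wait has already expired
        if self.pending is not None and self.pending != gesture:
            self._fire()  # a different gesture ends the wait early
        self.pending = gesture
        self.presses += 1
        self.last_press = now

    def tick(self, now: float) -> None:
        """Called periodically; runs the pending action once the wait has elapsed."""
        if self.pending is not None and now - self.last_press >= self.timeout:
            self._fire()

    def _fire(self) -> None:
        action = self.bindings.get((self.pending, self.presses))
        self.pending, self.presses = None, 0
        if action is not None:
            action()


log = []
dispatcher = DeferredGestureDispatcher(
    {
        ("NVDA+S", 1): lambda: log.append("cycle speech mode"),
        ("NVDA+S", 2): lambda: log.append("cycle sound split mode"),
    },
    timeout=1.0,
)
dispatcher.press("NVDA+S", now=0.0)
dispatcher.tick(now=0.5)  # still waiting, so the single-press action has not run yet
dispatcher.tick(now=1.0)  # the wait is over, so the single-press action runs
```

In this sketch, the single-press action never runs before the timeout has passed, which is exactly the delay described above.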

Conclusion

I don't know, I hope I was clear enough. Of course, I am here for all suggestions, comments, answers, whatever you are interested in, so that we can discuss together on this topic related to, in my opinion, a very interesting feature that I would like to see in NVDA someday.

josephsl · 2024-07-20T18:31:52Z

josephsl
Jul 20, 2024
Collaborator

Hi,

The way keyboard input works is a bit more complex than the explanation provided by @XLTechie:

When you press a keyboard command, the keyboard would send a scan code to the operating system. This is then intercepted by NVDA (input hook) and parsed. This is then followed by a script lookup, and if successful, the script counter is incremented (initially 0) and the command is executed. If another keyboard command comes within 0.5 seconds of the last script lookup, NVDA will see if the same keyboard command is being executed, incrementing the script counter if it locates the matching script to the one being executed.

What's missing in Luke's explanation (I'll comment on that thread too) is the time taken between key presses and actual script counter checks. Suppose you are typing on a wired (usually USB) keyboard and you press NVDA+S to cycle through speech modes (using the example from the original comment). Here's what actually happens:

1. You press the Insert and S keys.
2. The keyboard controller (hardware) inside the keyboard recognizes the electric signals coming from the key presses. This introduces a delay on the order of nanoseconds to microseconds.
3. The signals from the individual keys are then combined and sent to the operating system through a combination of the USB port and the keyboard driver. This takes microseconds.
4. The keyboard driver inside the operating system recognizes the code (usually called a scan code) from the keyboard and translates it into recognizable key codes for use by software. This takes nanoseconds.
5. NVDA intercepts the keyboard input routine and does its own code translation. This takes nanoseconds to microseconds (microseconds because Python is an interpreted language).
6. Once NVDA translates keyboard input into a form it understands (an input gesture), NVDA searches the gestures collection looking for a script to run. NVDA walks through gestures defined in the NVDA object that represents the focused control, browse mode and other documents if active, gestures defined by add-ons, and the default global commands map. All this takes up to several microseconds (to a human, this is an instant, but to a machine, a microsecond is a long time).
7. If there is a match between the keyboard gesture and a script, NVDA records the time the script was found, sets the script counter to 1, then runs the script (this too takes nanoseconds to microseconds).
8. This is repeated until NVDA has stepped through the whole gestures collection. If there is no match, NVDA passes the key press to Windows, where it is hopefully handled by whatever program you are using or by Windows itself, for example by entering characters or moving from one app to another.
9. If the script matches, then once the script is done, the script counter (for the one just executed) is reset to 0.
10. Suppose you press NVDA+S again. Steps 1 through 7 are performed, but this time NVDA notices that you wish to cycle speech modes. NVDA will check the time this script was last performed, compare it with the current time (to see if the script repeat window has passed), and increment the script counter if the key press was received within the repeat window (0.5 seconds). All of this takes time (up to microseconds), so the 0.5 second repeat window is an estimate at best, because things happening inside hardware, the operating system, and NVDA take time.
11. If no key presses were registered within the repeat window or you pressed a different key, NVDA performs other things such as handling events, reading text ranges, handling other input gestures, and so on.
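
The repeat-window bookkeeping in the steps above can be sketched roughly as follows. This is a simplified illustration rather than NVDA's actual implementation: `RepeatTracker`, `register`, and the script names are hypothetical, timestamps are passed in explicitly so the behavior is easy to follow, and the counter is assumed to start at 1 for a first press.

```python
from dataclasses import dataclass
from typing import Optional

REPEAT_WINDOW = 0.5  # seconds; the repeat window mentioned in the steps above


@dataclass
class RepeatTracker:
    """Hypothetical stand-in for the script time and counter bookkeeping."""

    last_script: Optional[str] = None
    last_time: float = float("-inf")
    presses: int = 0

    def register(self, script: str, now: float) -> int:
        """Record one matched key press and return the running press count.

        The count grows only when the same script is matched again within
        the repeat window; otherwise it starts over at 1.
        """
        if script == self.last_script and now - self.last_time <= REPEAT_WINDOW:
            self.presses += 1
        else:
            self.presses = 1
        self.last_script = script
        self.last_time = now
        return self.presses


tracker = RepeatTracker()
first = tracker.register("cycleSpeechMode", now=0.0)   # first press
second = tracker.register("cycleSpeechMode", now=0.3)  # inside the window
late = tracker.register("cycleSpeechMode", now=1.5)    # window has passed
```

Note that the script runs on every press in this model; the counter only tells the script how many presses came before it within the window.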

Suppose the proposal is adopted. This changes the script repeat window and counter to a timer where NVDA can be told to wait for additional key presses. As Luke explained in the linked NVDA thread, this can create perception of "performance issues" when in fact NVDA is running input gesture timers. Most importantly, because NVDA must wait for timers to complete (either because the timer has expired or a different input gesture came along), key presses not associated with NVDA will be delayed.

The case above is for wired keyboards. Working with wireless keyboards introduces additional complexity and time: not only must the keyboard driver transform key press signals into code understandable by Windows (and eventually by NVDA), it must do so using whatever wireless protocol the keyboard is using (Bluetooth, for example). A possibly more complex case is remote systems, where there will be a delay as a key press is received by the local system, sent over a network, and translated by the remote system, and the result is then sent back to the local system.

Hope this helps.

Thanks.

0 replies

tarikhadzirovicofficial · 2024-07-23T12:18:07Z

tarikhadzirovicofficial
Jul 23, 2024
Author

Hi @josephsl!
Thank you for the detailed explanation of how keyboard input works and its interaction with NVDA. I appreciate the effort and time put into this explanation.
Given the complexity and potential challenges you described, I am wondering if there is a possibility of a compromise or solution that would satisfy both users and technical requirements? I'm not a programmer, so I don't know. As an ordinary user, functionality without significant delays or performance issues is important to me.
Is there perhaps an alternative or adjustments that would allow for an optimal NVDA experience while minimizing delay and performance problems?

0 replies

gerald-hartig · 2024-08-02T04:55:09Z

gerald-hartig
Aug 2, 2024
Maintainer

We already support some multi-layer gestures in NVDA, right? E.g., pressing NVDA+Control+R once resets to the most recently saved settings, and pressing it three times resets NVDA to factory defaults.

0 replies

XLTechie · 2024-08-02T06:35:25Z

XLTechie
Aug 2, 2024
Collaborator

@gerald-hartig, what @tarikhadzirovicofficial is requesting is the ability to reassign what pressing NVDA+Ctrl+R twice or three times does, just as you can set what pressing the gesture once does. Even though the title of this issue does not quite match what is being asked, it is clear that what is wanted is the ability to control which scripts repeated key presses call. For example, right now, pressing NVDA+Ctrl+R calls the same script whether you press it once or three times. That script then determines what to do based on the number of presses. Actually, it does all three actions, from least destructive to most, but the user only notices the final one. @tarikhadzirovicofficial is asking for the ability, in this example, to change which script pressing NVDA+Ctrl+R three times calls, to, e.g., Toggle Audio Ducking.

It would require moving the repeat key detection logic out of the individual scripts and into inputCore. For that reason alone, it is highly unlikely to happen, since it would be such a large refactor. The only way to have any hope of making it work would be to change the input subsystem so that, if any gesture has something assigned to its second, third, or fourth press, pressing it once would wait for the (now configurable) timeout before calling the script assigned to a single press. It could not safely run the single-press action, and then the double-press action on top of it, etc., as is done now, because we could never trust that users understand not to assign destructive commands earlier in the press count. This is possible, but both Joseph and I have tried to explain that it is technically unwise for several reasons. I also think that most users wouldn't like the delays that would be introduced for single-press commands that also have double-press options.
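
As a rough illustration of the current model (not actual NVDA code; the function names and action labels below are hypothetical), a script that branches on the repeat count ends up running once per press, so a triple press performs every earlier action along the way:

```python
from typing import List

# Hypothetical action labels, ordered from least to most destructive.
ACTIONS = ["least destructive action", "intermediate action", "most destructive action"]


def multi_press_script(repeat_count: int, performed: List[str]) -> None:
    """Hypothetical script body: picks what to do from the repeat count it is given."""
    if repeat_count < len(ACTIONS):
        performed.append(ACTIONS[repeat_count])


def press_n_times(n: int) -> List[str]:
    """Simulate n quick presses: the same script runs on every single press."""
    performed: List[str] = []
    for repeat_count in range(n):  # 0 on the first press, 1 on the second, ...
        multi_press_script(repeat_count, performed)
    return performed


triple_press = press_n_times(3)
```

Here `triple_press` contains all three labels in order, which is why arbitrary user-assigned commands could not safely be layered this way without first waiting to see how many presses arrive.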

1 reply

gerald-hartig Aug 4, 2024
Maintainer

Thanks for the clarification @XLTechie, I've updated the title of the discussion to match.
