Add OSC 8 support#2243

Merged
CatsDeservePets merged 15 commits into
gokcehan:masterfrom
CatsDeservePets:OSC
Nov 3, 2025

Conversation

@CatsDeservePets

@CatsDeservePets CatsDeservePets commented Oct 31, 2025

Collaborator

Given @joelim-work's requirements:

  1. Detecting a terminal sequence
  2. Parsing that terminal sequence and converting it to a Tcell style object

applyAnsiCodes only does the second step after confirming that it is of the form ESC[ ... m.

I think to keep the code modular, we should do the following:

  • Create a new utility function which takes a string starting with ESC and returns the terminal sequence (or a blank string). The length can be used for the printLen and stripAnsi functions to know how much input to skip.
  • In the win.print function, the terminal sequence should then be converted to a Tcell style object. The parsing can be split into two different functions (one for SGR, and another for OSC), or kept as one big function that handles both.
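
A minimal sketch of the first bullet, under stated assumptions: readTermSequence and this version of stripTermSequence are hypothetical and are not the code from this PR. Only SGR (ESC [ ... m) and OSC (terminated by ST or BEL) are recognized, and the length of the returned sequence tells the caller how much input to skip.

```go
package main

import (
	"fmt"
	"strings"
)

// readTermSequence is a hypothetical helper (the name is an assumption).
// Given a string that starts with ESC, it returns the complete SGR or OSC
// sequence at the front of the string, or "" if none is recognized.
func readTermSequence(s string) string {
	if len(s) < 2 || s[0] != '\033' {
		return ""
	}
	switch s[1] {
	case '[': // CSI: accept only SGR, i.e. digits, ';' and ':' ending in 'm'
		for i := 2; i < len(s); i++ {
			c := s[i]
			if c == 'm' {
				return s[:i+1]
			}
			if (c < '0' || c > '9') && c != ';' && c != ':' {
				return ""
			}
		}
	case ']': // OSC: terminated by ST (ESC \) or BEL
		for i := 2; i < len(s); i++ {
			if s[i] == '\a' {
				return s[:i+1]
			}
			if s[i] == '\033' {
				if i+1 < len(s) && s[i+1] == '\\' {
					return s[:i+2]
				}
				return ""
			}
		}
	}
	return ""
}

// stripTermSequence removes every recognized sequence, using the length of
// each returned sequence to know how much input to skip.
func stripTermSequence(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); {
		if seq := readTermSequence(s[i:]); seq != "" {
			i += len(seq)
			continue
		}
		b.WriteByte(s[i])
		i++
	}
	return b.String()
}

func main() {
	in := "\033]8;;https://example.com\033\\\033[1mexample.com\033[0m\033]8;;\033\\"
	fmt.Println(stripTermSequence(in))
}
```

Converting the returned sequence into a style would then be a separate step, which keeps detection and parsing apart as suggested.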

BTW I'm not working on any of this, if you are interested then feel free to have a go.

[Screenshots: before and after]

Some thoughts:

  • I don't like that stripAnsi works on both CSI and OSC while applyAnsiCodes only works on CSI. That could lead to confusion.
  • Is there any point in having parseEscapeSequence?
  • What about the C1 versions mentioned in here?

@joelim-work joelim-work left a comment

Collaborator

Nice, I just tested the following lfrc file and it works:

echo "\033]8;;https://example.com\033\\example.com\033]8;;\033\\"

About your comments:

  • I don't like that stripAnsi works on both CSI and OSC while applyAnsiCodes only works on CSI. That could lead to confusion.

I think the functions can be renamed:

  • stripAnsi -> stripTermSequence
  • applyAnsiCodes -> applySGR
  • Is there any point in having parseEscapeSequence?

I guess this is now basically just applyTermSequence except that the base style is the default style. Generally terminal sequences like this modify an existing style, not create a new one. It should be fine to remove parseEscapeSequence and just use applyTermSequence only.

  • What about the C1 versions mentioned in here?

It's probably fine to leave it for now; processing terminal sequences can be a pain for ones that are not so standardized. I'm not even sure how necessary it is to handle BEL, but I'm not too fussed either way.

One other thing - I understand this is just a draft for now, but you should add unit tests for this later on.

@veltza

veltza commented Oct 31, 2025

Contributor

I also tested this feature and noticed that the current implementation does not preserve params in hyperlinks. In particular, the id parameter is important so that terminals can highlight all hyperlinks with the same id. You can try this in Bash and see how your terminal highlights them together:

printf '\e]8;id=1;http://example.com\e\\The first part of the link\e]8;;\e\\\n'
printf '\e]8;id=1;http://example.com\e\\and the second part of the link.\e]8;;\e\\\n'
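
To make the layout of these sequences concrete, here is a minimal sketch of splitting an OSC 8 sequence into its id parameter and URI. parseOSC8 is a hypothetical name, not a function from this PR; it assumes the sequence is complete and terminated by ST or BEL, with params given as ':'-separated key=value pairs.

```go
package main

import (
	"fmt"
	"strings"
)

// parseOSC8 is a hypothetical helper. It takes a complete OSC 8 sequence
// ("ESC ] 8 ; params ; URI" followed by ST or BEL) and returns the id
// parameter (empty if absent) and the URI. ok is false for anything else.
func parseOSC8(seq string) (id, uri string, ok bool) {
	const prefix = "\033]8;"
	if !strings.HasPrefix(seq, prefix) {
		return "", "", false
	}
	body := seq[len(prefix):]

	// Drop the terminator: ST (ESC \) or BEL.
	switch {
	case strings.HasSuffix(body, "\033\\"):
		body = body[:len(body)-2]
	case strings.HasSuffix(body, "\a"):
		body = body[:len(body)-1]
	default:
		return "", "", false
	}

	// params and URI are separated by the first ';', so any later ';'
	// stays part of the URI.
	params, target, found := strings.Cut(body, ";")
	if !found {
		return "", "", false
	}

	// Look for the id key among the ':'-separated key=value pairs.
	for _, kv := range strings.Split(params, ":") {
		if k, v, isPair := strings.Cut(kv, "="); isPair && k == "id" {
			id = v
		}
	}
	return id, target, true
}

func main() {
	id, uri, ok := parseOSC8("\033]8;id=1;http://example.com\033\\")
	fmt.Printf("id=%q uri=%q ok=%v\n", id, uri, ok)
}
```

An empty URI with ok set to true corresponds to the closing sequence that ends the hyperlink.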

[Screenshot: osc8_1]

There's also a quirk in the feature that highlights words when the mouse is over a word, and spaces when the mouse is over them. It gives the impression that there are two hyperlinks instead of one:

printf '\e]8;;http://example.com\e\\This  is  a  link\e]8;;\e\\\n'

[Screenshots: osc8_2a, osc8_2b]

This can be replicated with Alacritty, Contour and Xfce terminals, but not with Kitty and Wezterm.

Despite these issues, this looks very promising!

@CatsDeservePets

CatsDeservePets commented Oct 31, 2025

Collaborator Author

Hey @veltza!

Thank you so much for testing this out and leaving some feedback!

For testing, I mostly use this example file. I noticed that even outside lf, behaviour varies greatly between iTerm, kitty and ghostty (using cat hyperlink-demo.txt).

I found it rather difficult to differentiate between mismatches caused by incorrect parsing on my side and different implementations on a terminal level. However, I will look deeper into this.

Edit: I noticed you were also involved in testing the new preloading feature. Thanks once again for taking your time to test new features and patches, it really helps a lot! ❤️

@veltza

veltza commented Oct 31, 2025

Contributor

For testing, I mostly use this example file. I noticed that even outside lf, behaviour varies greatly between iTerm, kitty and ghostty (using cat hyperlink-demo.txt).

I also used that test file when I implemented OSC-8 support in my st fork. The file includes so many edge cases that I didn’t bother handling all of them. So I recommend that you don't spend too much time on them either, because applications use fairly simple hyperlinks anyway.

@CatsDeservePets

CatsDeservePets commented Oct 31, 2025

Collaborator Author

I also tested this feature and noticed that the current implementation does not preserve params in hyperlinks. In particular, the id parameter is important so that terminals can highlight all hyperlinks with the same id. You can try this in Bash and see how your terminal highlights them together:

printf '\e]8;id=1;http://example.com\e\\The first part of the link\e]8;;\e\\\n'
printf '\e]8;id=1;http://example.com\e\\and the second part of the link.\e]8;;\e\\\n'

This is fixed now!

There's also a quirk in the feature that highlights words when the mouse is over a word, and spaces when the mouse is over them. It gives the impression that there are two hyperlinks instead of one:

printf '\e]8;;http://example.com\e\\This  is  a  link\e]8;;\e\\\n'

This can be replicated with Alacritty, Contour and Xfce terminals, but not with Kitty and Wezterm.

Not sure why this happens, cross-terminal stuff is a real pain (as everyone here seems to be aware of).

OK, this is related to the id as well (or lack thereof). The docs encourage some tools to generate an id if none has been given.

Edit: This has been fixed now as well.

@CatsDeservePets CatsDeservePets added this to the r39 milestone Oct 31, 2025
@CatsDeservePets CatsDeservePets marked this pull request as ready for review November 1, 2025 02:24

@joelim-work joelim-work left a comment

Collaborator

I have looked over the code and it's generally fine, some minor suggestions from me.

Comment thread misc.go Outdated
Comment thread misc.go
Comment thread ui.go Outdated
@CatsDeservePets

Collaborator Author

Please take one last look, @joelim-work.
Thank you very much for taking your time.
I am actually quite pleased with how this code has turned out.

@veltza

veltza commented Nov 1, 2025

Contributor

@CatsDeservePets Hold on, I’ve found a bug. I’ll report it shortly.

@CatsDeservePets CatsDeservePets marked this pull request as draft November 1, 2025 14:18
@CatsDeservePets CatsDeservePets added the new Pull requests that add new behavior label Nov 1, 2025

@veltza veltza left a comment

Contributor

I also analyzed the strange behavior of hyperlinks using my st fork and found that lf generates two hyperlinks from a single hyperlink and draws them on top of each other.

For example, if I print a hyperlink like this:

printf '\e]8;;http://example.com\e\\This is link\e]8;;\e\\\n'

lf generates two hyperlinks:

  1. First, it prints the hyperlink as it should:
printf '\e[3;53H\e]8;;http://example.com\e\\This is link\e[3;105H\e[0m\e]8;;\e\\\n'
  2. But then for some reason it prints another hyperlink on top of it and only renders the words in the link text because it skips spaces using escape sequences:
printf '\e[3;53H\e]8;;http://example.com\e\\This\e[3;58His\e[3;61Hlink\e[3;105H\e[0m\e]8;;\e\\\n'

This second hyperlink is what's causing the strange behavior. So if we could get rid of it, we wouldn't need to create any fallback ids in the first place.

Comment thread termseq.go Outdated
@CatsDeservePets

Collaborator Author

I also analyzed the strange behavior of hyperlinks using my st fork and found that lf generates two hyperlinks from a single hyperlink and draws them on top of each other.

For example, if I print a hyperlink like this:

printf '\e]8;;http://example.com\e\\This is link\e]8;;\e\\\n'

lf generates two hyperlinks:

  1. First, it prints the hyperlink as it should:
printf '\e[3;53H\e]8;;http://example.com\e\\This is link\e[3;105H\e[0m\e]8;;\e\\\n'
  2. But then for some reason it prints another hyperlink on top of it and only renders the words in the link text because it skips spaces using escape sequences:
printf '\e[3;53H\e]8;;http://example.com\e\\This\e[3;58His\e[3;61Hlink\e[3;105H\e[0m\e]8;;\e\\\n'

This second hyperlink is what's causing the strange behavior. So if we could get rid of it, we wouldn't need to create any fallback ids in the first place.

Oh, interesting.

@veltza

veltza commented Nov 1, 2025

Contributor

Oh, interesting.

Indeed. It would be interesting to know what causes it and why. My concern is that, without knowing its purpose, it might cause additional side effects.

Edit: I found out that the behavior comes from tcell and can be reproduced with the hyperlink.go demo. I also found out that it was gdamore/tcell@d17cf8b that broke the behavior of hyperlinks. But since I am not familiar with tcell, I don't understand why it causes this regression. So I guess no one uses hyperlinks because no one has noticed this yet.

@CatsDeservePets

CatsDeservePets commented Nov 1, 2025

Collaborator Author

Oh, interesting.

Indeed. It would be interesting to know what causes it and why. My concern is that, without knowing its purpose, it might cause additional side effects.

Edit: I found out that the behavior comes from tcell and can be reproduced with the hyperlink.go demo. I also found out that it was gdamore/tcell@d17cf8b that broke the behavior of hyperlinks. But since I am not familiar with tcell, I don't understand why it causes this regression. So I guess no one uses hyperlinks because no one has noticed this yet.

And once again, thank you for your research. So this is outside the scope of this PR. I will update the custom id generation and mark this as ready for review, as I don't think this will cause any trouble.

It seems to me that there might be a problem with tcell incorrectly diffing the screen state to determine which parts to update.

@CatsDeservePets CatsDeservePets marked this pull request as ready for review November 1, 2025 21:43
Comment thread misc.go Outdated
@joelim-work

Collaborator

Regarding the id parameter, I am just wondering but is there a need to autogenerate one if it is not provided? The aim here is simply to convert existing OSC8 sequences to Tcell style objects; external applications that generate these sequences should be responsible for filling in the ID if it is needed.

@CatsDeservePets

Collaborator Author

Regarding the id parameter, I am just wondering but is there a need to autogenerate one if it is not provided? The aim here is simply to convert existing OSC8 sequences to Tcell style objects; external applications that generate these sequences should be responsible for filling in the ID if it is needed.

Without the id, it's up to the terminal emulator to decide which links belong together (some are smarter than others). I do agree, ideally the ID is provided by the external application, but that isn't always the case (I'd imagine users writing some configuration to output clickable links don't add it either).
There is an entire section about this in the gist.

Quoting from there:

Complex apps that manage the full screen and wish to explicitly linkify URIs, such as viewers or editors, should assign explicit ids that identify that particular link, so that it keeps being underlined together even across a linebreak, across another pane or window of the app's UI, and even across crazily optimized screen updates (e.g. when it repaints only a part of an anchor text). Such an id might perhaps be the file offset, or the (row, column) tuple where the hyperlink starts. Apps that support multiple windows, such as the imaginary text editor with that screenshot above, should add the ID of the window to the link's id too so that it does not conflict with the same target URI appearing in another window.

Complex apps that display data that might itself contain OSC 8 hyperlinks (such as terminal multiplexers, less -R) should do the following: If the encountered OSC 8 hyperlink already has an id, they should prefix it with some static string, or if multiple windows/panes are supported by the app, a prefix that's unique to that window/pane to prevent conflict with other windows/panes. If the encountered OSC 8 hyperlink does not have an id, they should automatically create one so that they can still have multiple windows/panes and can still crazily partially update the screen and keep it as a semantically single hyperlink towards the host emulator (remember the difference in VTE and iTerm2 when no id is set which becomes relevant here, so it should be avoided). This id should be taken from a namespace that cannot conflict with a mangled explicit id. It's probably much easier to implement VTE's approach here: assign a new id (maybe a sequential integer) whenever an OSC 8 with an URI but no id is encountered. This way there's absolutely no need to maintain any internal pool of the active hyperlink ids or anything like that, it's just a trivial mapping each time an OSC 8 is encountered in the data that needs to be displayed.
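
The approach described in the quote can be sketched roughly as follows. This is only an illustration of the idea, not the id generation used in this PR: the linkIDs type, the resolve method and the "x-" and "a-" prefixes are all assumptions. Explicit ids get one prefix and generated ids another, so the two namespaces cannot collide, and a generated id is just a sequential integer.

```go
package main

import (
	"fmt"
	"strconv"
)

// linkIDs hands out fallback ids for hyperlinks that arrive without one.
// The type, the counter and both prefixes are illustrative assumptions.
type linkIDs struct {
	next int
}

// resolve returns the id to forward to the host terminal for a hyperlink
// with the given (possibly empty) id and URI.
func (l *linkIDs) resolve(id, uri string) string {
	if uri == "" {
		// An empty URI closes the current hyperlink; no id is needed.
		return ""
	}
	if id != "" {
		// Mangle explicit ids with a static prefix.
		return "x-" + id
	}
	// No id given: assign a new sequential one from a separate namespace.
	l.next++
	return "a-" + strconv.Itoa(l.next)
}

func main() {
	var ids linkIDs
	fmt.Println(ids.resolve("1", "http://example.com"))
	fmt.Println(ids.resolve("", "http://example.com"))
	fmt.Println(ids.resolve("", "http://example.com"))
}
```

Because each id-less OSC 8 gets a fresh id, no pool of active hyperlink ids has to be maintained, which is the simplification the quoted text points out.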

@veltza

veltza commented Nov 3, 2025

Contributor

Regarding the id parameter, I am just wondering but is there a need to autogenerate one if it is not provided?

@joelim-work The reason has already been explained in this thread. So feel free to fix tcell so we don't have to use this workaround.

@CatsDeservePets

Collaborator Author

I think at this point, this PR can be merged.

@CatsDeservePets CatsDeservePets merged commit e45412d into gokcehan:master Nov 3, 2025
32 checks passed
@CatsDeservePets CatsDeservePets deleted the OSC branch November 4, 2025 07:32
@joelim-work

joelim-work commented Nov 6, 2025

Collaborator

Sorry, I was busy the last few days so I didn't have a chance to respond.

Regarding the double hyperlink issue in Tcell, I can reproduce it with the hyperlink.go demo too. I think what is happening is that the space character is a magic value used in Tcell to indicate an 'empty' cell. This means that clearing the screen and then printing foo bar will result in only cells containing visible characters being marked as dirty and then sent as updates to the terminal, creating the second hyperlink with only foo and bar.

But this is just speculation on my part. It's probably worth raising an issue in Tcell, since it would be nice not to need this kind of workaround. The code is fine for now though, thanks for working on the feature.
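
The speculated effect can be pictured with a toy model. This is not Tcell code and makes no claim about how Tcell actually tracks dirty cells; dirtyRuns and the run type are hypothetical. It only shows that if a cleared row is all spaces and only differing cells are sent, the text foo bar is emitted as two separate runs.

```go
package main

import "fmt"

// run is a contiguous stretch of changed cells starting at column col.
type run struct {
	col  int
	text string
}

// dirtyRuns is a toy model, not Tcell code. It compares the previous row
// with the new row and returns the runs of cells that differ.
func dirtyRuns(prev, next []rune) []run {
	var runs []run
	start := -1 // start column of the current run, or -1 if none is open
	for i := range next {
		changed := i >= len(prev) || prev[i] != next[i]
		switch {
		case changed && start < 0:
			start = i
		case !changed && start >= 0:
			runs = append(runs, run{start, string(next[start:i])})
			start = -1
		}
	}
	if start >= 0 {
		runs = append(runs, run{start, string(next[start:])})
	}
	return runs
}

func main() {
	prev := []rune("       ") // row after clearing: every cell is a space
	next := []rune("foo bar")
	for _, r := range dirtyRuns(prev, next) {
		fmt.Printf("col %d: %q\n", r.col, r.text)
	}
}
```

In this model the space between the words is never marked as changed, which matches the cursor jumps seen in the second printf above.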

Labels

new Pull requests that add new behavior

Development

Successfully merging this pull request may close these issues.

Support OSC 8 escape sequences for URLs in preview

3 participants