Skip to content

Conversation

SmallJoker
Copy link
Member

…ings

Prevously, the translation update script would match 'if not input:find("%S") then' because 'S"' is a valid function call expression. This commit addresses that by only matching single lines. Whereas that's a workaround, it does work well. --> Tested on luanti/builtin

This furthermore adds the option '--discard-empty' which omits empty translation strings from all files but template.txt.

…ings

Prevously, the translation update script would match 'if not input:find("%S") then' because 'S"' is
a valid function call expression. This commit addresses that by only matching single lines.
Whereas that's a workaround, it does work well. --> Tested on luanti/builtin

This furthermore adds the option '--discard-empty' which omits empty translation strings from
all files but template.txt.
@SmallJoker SmallJoker changed the title Translation script: Fix incorrect matches, allow discarding empty str… Translation script: Fix incorrect matches, allow discarding empty srings Sep 23, 2025
@appgurueu
Copy link
Contributor

Prevously, the translation update script would match if not input:find("%S") then because S" is a valid function call expression.

I am confused: S" is not a valid function call expression? Do you mean: If there is another string later, S")..." could be a valid function call expression?

Either way the problem is that we start matching inside a string, no? The only real way to combat that is to keep track of tokens (at least at a string / non-string granularity), then have a regex of (token)*pattern.

Is there ever a situation where we want to allow empty translation strings?


(Ultimately this is another problem caused by just throwing regex at this rather than a clean two step approach starting with tokenization... but probably not worth fixing?)

@SmallJoker
Copy link
Member Author

Do you mean: If there is another string later, S")..." could be a valid function call expression?

Yes, that. The code I wrote to fix that is not perfect, but at least gets rid of all false-positives within luanti/builtin.

Either way the problem is that we start matching inside a string, no? The only real way to combat that is to keep track of tokens (at least at a string / non-string granularity), then have a regex of (token)*pattern.

Yes, but I'm not going to implement that. Regex isn't the best choice here, but it's what we've got, and what works OK in most cases. Feel free to propose code changes.

Is there ever a situation where we want to allow empty translation strings?

Modders's choice to provide convenience for the translators.

@y5nw
Copy link
Contributor

y5nw commented Sep 24, 2025

This furthermore adds the option '--discard-empty' which omits empty translation strings from all files but template.txt.

Usecase? Empty strings in translation files allow looking for strings that are not yet translated and translating those directly (without cross-referencing the template or adding the missing entries manually).

Discarding empty strings brings hardly any gain to speak of in terms of disk space unless many strings are not translated. The argument on disk space/network transfer times is also somewhat paradoxical in itself: the "wasted" space is a result of the lack of translations; this should arguably be improved by actually providing translations for such strings, which has the opposite effect of increasing the size of translations files.

(Edit: Clarification)

@appgurueu
Copy link
Contributor

Oh! I had misunderstood "empty translation strings" to mean S(""). Should have taken a closer look at the code. Disregard my above comment regarding that then, it was related to my question as to which function call expression was being referred to...

@SmallJoker SmallJoker changed the title Translation script: Fix incorrect matches, allow discarding empty srings Translation script: Fix incorrect multiline matches Sep 27, 2025
@SmallJoker
Copy link
Member Author

Removed the --discard-empty option.

@y5nw
Copy link
Contributor

y5nw commented Sep 27, 2025

Oh! I had misunderstood "empty translation strings" to mean S("").

Actually there should also be a warning against S("") (perhaps in a separate PR or a warning in the engine1?). PO files use the empty source string for metadata, so people who use the empty source string in tr files and (later) convert them to PO will have quite a surprise at some point.

Footnotes

  1. Note that - assuming the client can trigger the server to send S("") with a PO-file-based translation - this does not leak any information, as the client is already given the entire PO file during media loading.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants