Skip to content

Conversation

@shiran2
Copy link

@shiran2 shiran2 commented Sep 23, 2025

This PR extends forum search with support for filtering by forum category.
It prepares the ground for a follow-up PR in lila that will fix lichess-org/lila#18240

The change is backward compatible and safe to deploy.
Note: Elasticsearch must be reindexed for the forum index to populate the new ca (category) field before category filtering will work.
Once reindexing is complete, the lila changes that pass categIds in search requests should be deployed.

Copy link
Collaborator

@ornicar ornicar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, when searching for posts on behalf of a user, lila will send the list of forum categIds that the user is allowed to see.

That would be the 5 official categs on https://lichess.org/forum.
And the (up to 150) forums of the teams that the user has joined, where the user has the permission to see the team forum:

Image

And all the teams that have allowed their forum to be public:

Image

At the moment that's more than 6,000 teams, and it will only increase.

I don't think we want to send 6,000 categIds to each search query.

doc.getNested(F.createdAt).flatMap(_.asInstant).map(_.toEpochMilli),
doc.getString(F.userId).some
).mapN(ForumSource.apply(_, _, topicId, _, _, _))
doc.getString(F.userId),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change means that we no longer index (historical) anonymous forum posts.

That's fine by me.

doc.getString(F.userId),
doc.getString(F.categId)
).mapN { (body, topic, troll, date, author, categ) =>
ForumSource.apply(body, topic, topicId, troll, date, categ, Some(author))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ForumSource.author should therefore be String and not Option[String]

@ornicar
Copy link
Collaborator

ornicar commented Sep 30, 2025

And this is why we currently show hidden posts for search matches. Is it ideal? No. But getting it done better is likely to require a lot more code and complexity than it's worth.

Happy to be proven wrong if you have a better idea of course.

@shiran2
Copy link
Author

shiran2 commented Sep 30, 2025

@ornicar The problem is that currently all public forums have unique categIds. If we give public forums the same "public" categId, we can call lila-search with [public, team1, … teamN]. Instead of 6,000+ values, that caps at 155 (and usually far less, since most users have <150 teams). We could either reuse categIds for this or add an access column with "public" for public forums and the team ID for the rest.
What do you think?

@ornicar
Copy link
Collaborator

ornicar commented Sep 30, 2025

We could either reuse categIds for this or add an access column with "public" for public forums

Let's not touch the categIds, they're quite necessary.

We could however denormalize the team forum access setting, theoretically. It means that every time a team leader changes the team setting, we'd have to tell lila-search so that it can update all the forum post of the team.

Then we "only" need to add to the search query the list of categIds of teams that the user belongs to, after checking for each of them that they allow their members to view the team forum.

It's a lot of complexity, over 2 code repositories, and it's additional data denormalization over 2 distant databases (mongo & elasticsearch) and it's a perf hit on each search query (especially to read all settings of all teams of the user).

For a feature that, while legitimate, no-one has asked for in years.

This falls below the arbitrary ratio of usefulness/cost that I set for Lichess at the moment.

@shiran2
Copy link
Author

shiran2 commented Sep 30, 2025

@ornicar Ok, so you’re not interested in fixing this issue?
By the way, regardless of the security aspect, it would be a better user experience to scroll without the hidden results noise.

@ornicar
Copy link
Collaborator

ornicar commented Oct 4, 2025

I'm interested in fixing every issue, as long as doing so doesn't add so much complexity that the overall project is worse for it.

The direction this is going is not promising.

@shiran2
Copy link
Author

shiran2 commented Oct 4, 2025

Have a better suggestion for a fix?

@ornicar
Copy link
Collaborator

ornicar commented Oct 4, 2025

I can't think of one that wouldn't be worse than the current state of things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Forum search pagination allows guessing if a word is present in a private team forum

2 participants