[FEA] Radix topk remove or improve chunking #1974

Open
tfeher opened this issue Nov 8, 2023 · 1 comment
Labels
feature request New feature or request Vector Search

Comments

tfeher commented Nov 8, 2023

While PR #1878 gives an overall performance improvement, there are some input parameter combinations where it leads to a small slowdown. Details: #1878 (comment)

The reason is that the chunk size was chosen to be 20000 before this PR and becomes 4320 after it, so five kernels are run instead of just one.
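To make the arithmetic concrete, here is a minimal sketch of how the number of chunked launches follows from the chunk size. It assumes a batch of 20000 rows (inferred from the single launch before the PR, not stated above), and `num_chunks` is a hypothetical helper, not a RAFT function.

```python
def num_chunks(batch_size: int, chunk_size: int) -> int:
    """Number of chunked launches needed to cover batch_size rows (ceiling division)."""
    return (batch_size + chunk_size - 1) // chunk_size


# Assumption: the batch has 20000 rows, so the old chunk size covered it in one go.
batch_size = 20000
print(num_chunks(batch_size, 20000))  # old chunk size
print(num_chunks(batch_size, 4320))   # new chunk size
```

Under that assumption the old chunk size gives one launch and the new one gives five, which matches the numbers reported above.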

This is easy to fix by setting a minimum chunk size.
...
More aggressively, we could remove chunking altogether.
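A rough sketch of what a minimum chunk size could look like, assuming the heuristic yields a candidate chunk size that is then clamped. The function name, its parameters, and the value 32768 are all hypothetical and only for illustration; the actual RAFT heuristic is not reproduced here.

```python
def choose_chunk_size(batch_size: int, heuristic_chunk_size: int, min_chunk_size: int) -> int:
    """Clamp a heuristic chunk size from below, never exceeding the batch itself."""
    # Raise small heuristic values to the floor (hypothetical parameter).
    chunk = max(heuristic_chunk_size, min_chunk_size)
    # A chunk larger than the batch is pointless, so cap it at the batch size.
    return min(chunk, batch_size)


# Hypothetical floor of 32768 rows: the 20000-row batch runs as a single chunk again.
print(choose_chunk_size(batch_size=20000, heuristic_chunk_size=4320, min_chunk_size=32768))
```

Removing chunking entirely would correspond to always returning `batch_size`, at the cost of a workspace that grows with the batch.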

@tfeher tfeher added the feature request New feature or request label Nov 8, 2023

tfeher commented Nov 8, 2023

When chunking was introduced, we assumed that splitting large batches into smaller (but still reasonably large) chunks would not have a significant effect on the runtime. That apparently does not hold for this case, where len is small.

Let's list typical vector search use cases and see how changing the chunk size, or removing chunking, would impact memory usage.
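As a starting point for that comparison, here is a back-of-the-envelope sketch. It assumes the temporary workspace scales linearly with the number of rows in a chunk times len, with a value and an index stored per element in two buffers. That cost model, the helper name, and the shapes below are illustrative assumptions, not measurements or the real buffer layout.

```python
def workspace_bytes(chunk_rows: int, row_len: int,
                    bytes_per_value: int = 4, bytes_per_index: int = 4,
                    buffers: int = 2) -> int:
    """Estimated workspace under the assumed model: rows * len * element size * buffers."""
    return chunk_rows * row_len * (bytes_per_value + bytes_per_index) * buffers


# Hypothetical (batch_size, len) shapes, chosen only to show the trend.
for batch_size, row_len in [(20000, 1000), (100000, 10000)]:
    chunked = workspace_bytes(min(batch_size, 4320), row_len)
    unchunked = workspace_bytes(batch_size, row_len)
    print(f"batch={batch_size} len={row_len}: "
          f"chunked={chunked / 2**20:.1f} MiB, unchunked={unchunked / 2**20:.1f} MiB")
```

Plugging in real use cases and the actual per-row buffer sizes would show where removing chunking becomes too expensive in memory.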

Projects
Status: In Progress