While PR #1878 gives an overall performance improvement, there are some input parameter combinations where it leads to a small slowdown. Details: #1878 (comment)
The reason is that the chunk size was 20000 before this PR and becomes 4320 after it, so five kernels are launched instead of just one.
This is easy to fix by setting a minimal chunk size.
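A minimal sketch of the fix described above (names like `choose_chunk_size` and the constant value are illustrative assumptions, not the actual implementation): clamping the computed chunk size to a floor keeps an input of `len == 20000` in a single kernel launch instead of five.

```python
import math

# Hypothetical floor, chosen here so that len == 20000 stays one chunk.
MIN_CHUNK_SIZE = 20000

def num_kernel_launches(total_len: int, chunk_size: int) -> int:
    """How many kernel launches are needed to cover `total_len` rows."""
    return math.ceil(total_len / chunk_size)

def choose_chunk_size(computed: int, minimum: int = MIN_CHUNK_SIZE) -> int:
    """Clamp the computed chunk size to a minimum (the proposed fix)."""
    return max(computed, minimum)

# Before the PR: chunk size 20000 -> one launch for len == 20000.
assert num_kernel_launches(20000, 20000) == 1
# After the PR: chunk size 4320 -> five launches for the same input.
assert num_kernel_launches(20000, 4320) == 5
# With a minimal chunk size, the single-launch behavior is restored.
assert num_kernel_launches(20000, choose_chunk_size(4320)) == 1
```

The trade-off is that a larger floor raises the per-chunk memory footprint, which is why the use-case discussion below matters.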
...
More aggressively, we could remove chunking entirely.
When chunking was introduced, we assumed that splitting large-batch cases into smaller (but still reasonably large) chunks would not have a significant effect on runtime. That apparently does not hold for this case, where len is small.
Let's list typical vector search use cases and see how changing the chunk size (or removing chunking) would impact memory usage.