Skip to content

Conversation

@tjgreen42
Copy link
Contributor

@tjgreen42 tjgreen42 commented Nov 12, 2024

Description

This PR adds support for distance metrics besides cosine, beginning with L2. The primitives for this metric were already present (earlier versions of pgvectorscale used L2 rather than cosine), so this PR just adds the operator class-related plumbing. As discussed in the design doc (see references) we follow pgvector syntax.

Testing

  • Refactored some of the basic scaffolds to take a DistanceType parameter and added L2 versions of some of the callers. We could carry this out systemically and insist that all tests have variants covering all metrics, but this seems like overkill.
  • Confirmed manually via EXPLAIN PLAN that the correct index is still chosen.
  • Confirmed manually that test_upgrade succeeds (with generated sql schema, not included in this PR)

References

@tjgreen42 tjgreen42 requested a review from a team as a code owner November 12, 2024 19:05
@tjgreen42 tjgreen42 requested review from cevian and syvb November 12, 2024 19:06
@tjgreen42 tjgreen42 changed the title [Distance metrics M1] support for L2 (aka Euclidean) distance [Distance metrics M1] support for L2 distance Nov 12, 2024
Copy link
Collaborator

@cevian cevian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good but some comments

@tjgreen42 tjgreen42 requested a review from cevian November 13, 2024 02:39
Copy link
Collaborator

@cevian cevian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks good except for the update script which I still have concerns about.

tjgreen42 and others added 3 commits November 13, 2024 16:46
pgrx sets CARGO_TARGET_DIR so we were previously sharing a target dir
between multiple versions, which caused havoc.

Now set explicitly.
tjgreen42 and others added 8 commits November 14, 2024 12:02
Co-authored-by: Matvey Arye <cevian@gmail.com>
Signed-off-by: tjgreen42 <tj@timescale.com>
Co-authored-by: Matvey Arye <cevian@gmail.com>
Signed-off-by: tjgreen42 <tj@timescale.com>
Signed-off-by: tjgreen42 <tj@timescale.com>
The cargo environment variable is needed for correct
sql generation with versioned so's.

Small fixes to migration scripts.

Related to work adding l2.
@tjgreen42 tjgreen42 requested review from cevian and syvb November 15, 2024 02:56
Copy link
Collaborator

@cevian cevian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may make sense to squash some commits

@tjgreen42 tjgreen42 merged commit 43e3f5e into main Nov 15, 2024
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants