Distinguish data from distance type in vector space by GuilhemN · Pull Request #484 · nmslib/nmslib

GuilhemN · 2021-06-09T07:32:29Z

Related to #482

GuilhemN · 2021-06-09T09:40:41Z

I migrated SpaceSparseJaccard using this in another branch (using ann-benchmarks with my PR erikbern/ann-benchmarks#239).

The results are quite impressive.

Before: (image not preserved)

After: (image not preserved)
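To make the motivation concrete, here is a minimal sketch of a sparse Jaccard distance in which the stored data are integer IDs while the distance is a float. It is not the code from the PR or from nmslib's SpaceSparseJaccard; the function name `jaccard_distance`, the template parameter `IdT`, and the assumption that each vector is sorted and free of duplicates are all made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not nmslib's API): Jaccard distance between two sets
// stored as sorted, duplicate-free vectors of integer IDs.
// IdT is the data type; the distance type is float.
template <typename IdT>
float jaccard_distance(const std::vector<IdT>& a, const std::vector<IdT>& b) {
    // Assumption of this sketch: inputs are already sorted.
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));

    // Count common IDs with a linear merge-style scan.
    std::size_t i = 0, j = 0, common = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] == b[j]) { ++common; ++i; ++j; }
        else if (a[i] < b[j]) { ++i; }
        else { ++j; }
    }

    const std::size_t union_size = a.size() + b.size() - common;
    if (union_size == 0) return 0.0f;  // two empty sets are treated as identical
    return 1.0f - static_cast<float>(common) / static_cast<float>(union_size);
}
```

With a single template parameter shared by data and distance, the IDs would have to be stored in the distance type; separating the two lets the IDs stay integers.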

searchivarius · 2021-06-10T22:09:15Z

Hi @GuilhemN I have reviewed the request. Yes, we can follow this path, but it is a rather substantial change of the C++ API so I need to discuss it with other people. And also if we are doing something like this, don't you think it's easier to have just a float distance instead? An extra template parameter, you know, it would be nice to get rid of dist_t then forever.

GuilhemN · 2021-06-14T06:11:43Z

> Yes, we can follow this path, but it is a rather substantial change of the C++ API so I need to discuss it with other people.

Sure, no problem.

> And also if we are doing something like this, don't you think it's easier to have just a float distance instead? An extra template parameter, you know, it would be nice to get rid of dist_t then forever.

Wouldn't that be a bit weird for the Hamming distance? And also maybe slower than ints?
But I can do it if you want. That would leave us with just one template parameter (which I called dist_uint_t in this proposal, but which should probably be renamed to something like data_t).
It would also break compatibility with the former Python interface, so it would probably require a new major version, right?
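The two options under discussion can be sketched as follows. This is only an illustration of the trade-off, not the PR's implementation: `BitHamming`, `HammingInt`, and `HammingFloat` are hypothetical names, and `data_t` / `dist_t` are used here simply as template parameter labels.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical functor (not nmslib's API): bit-level Hamming distance over
// packed unsigned words. data_t is the storage type, dist_t the result type.
template <typename data_t, typename dist_t>
struct BitHamming {
    static_assert(std::is_unsigned<data_t>::value, "data_t must be an unsigned integer type");

    dist_t operator()(const std::vector<data_t>& a, const std::vector<data_t>& b) const {
        assert(a.size() == b.size());
        std::size_t diff = 0;
        for (std::size_t i = 0; i < a.size(); ++i) {
            // XOR marks differing bits; bitset::count() counts them.
            diff += std::bitset<8 * sizeof(data_t)>(a[i] ^ b[i]).count();
        }
        return static_cast<dist_t>(diff);
    }
};

// Option 1: keep two parameters, so Hamming stays an integer distance.
using HammingInt = BitHamming<std::uint32_t, int>;
// Option 2: fix the distance to float everywhere, leaving only the data type.
using HammingFloat = BitHamming<std::uint32_t, float>;
```

A float represents every integer up to 2^24 exactly, so the float variant returns the same values as the integer one as long as the number of differing bits stays below that bound.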

searchivarius · 2021-06-14T11:52:39Z

Float as a distance is going to be ok IMHO, but the compatibility will be broken, indeed. Let me send an email shortly. That needs to be discussed with more people.

searchivarius · 2021-06-22T17:05:43Z

Hi @GuilhemN after some consideration, I think I will likely go with your approach. I don't worry about float being inefficient, but using float instead of int will break compatibility. Please, give me some time, I will batch update the library (there are several important PRs) when I have a bit more time.

Thank you!

GuilhemN · 2021-06-23T22:26:10Z

Hi! Ok thank you, let me know when I should rebase/rework on this PR :)

searchivarius · 2021-06-28T11:39:00Z

Hi @GuilhemN I have to postpone things a bit, then I will update the library "en masse". I will let you know if something needs to be changed, but likely it's going to be ok.

GuilhemN mentioned this pull request Jun 9, 2021

Support input type different than the distance type #482

Open

GuilhemN force-pushed the INPUTTYPE branch 2 times, most recently from 7602572 to 31c73e7 Compare June 9, 2021 07:36

Distinguish data from distance type in vector space

452453a

GuilhemN force-pushed the INPUTTYPE branch from 31c73e7 to 452453a Compare June 9, 2021 09:35

GuilhemN wants to merge 1 commit into nmslib:master from GuilhemN:INPUTTYPE
