Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Negation operator influences other filters (stops processing of filters after it) #145

Open
jrots opened this issue Aug 28, 2017 · 10 comments
Assignees

Comments

@jrots
Copy link

jrots commented Aug 28, 2017

Best explained with an example,
think the output of the two queries below should be the same.

127.0.0.1:6379> FT.CREATE myIdx SCHEMA test1 NUMERIC test2 TEXT
OK
127.0.0.1:6379> FT.ADD myIdx doc1 1.0 FIELDS test1 1 test2 1
OK
127.0.0.1:6379> FT.ADD myIdx doc2 1.0 FIELDS test1 2 test2 2
OK
127.0.0.1:6379> FT.ADD myIdx doc3 1.0 FIELDS test1 3 test2 3
OK
127.0.0.1:6379>
127.0.0.1:6379> FT.ADD myIdx doc4 1.0 FIELDS test1 4 test2 4
OK
127.0.0.1:6379> FT.ADD myIdx doc5 1.0 FIELDS test1 5 test2 5
OK
127.0.0.1:6379> FT.SEARCH myIdx "@test1:[1 3] -@test2:(2)"
1) (integer) 2
2) "doc3"
3) 1) "test1"
   2) "3"
   3) "test2"
   4) "3"
4) "doc1"
5) 1) "test1"
   2) "1"
   3) "test2"
   4) "1"
127.0.0.1:6379> FT.SEARCH myIdx "-@test2:(2) @test1:[1 3]"
1) (integer) 3
2) "doc5"
3) 1) "test1"
   2) "5"
   3) "test2"
   4) "5"
4) "doc4"
5) 1) "test1"
   2) "4"
   3) "test2"
   4) "4"
6) "doc1"
7) 1) "test1"
   2) "1"
   3) "test2"
   4) "1"
@dvirsky
Copy link
Contributor

dvirsky commented Aug 28, 2017

Interesting, thanks! It's a parsing precedence ambiguity bug. Look at the parse trees using explain and it becomes clear:

127.0.0.1:6379> FT.EXPLAIN  myIdx "@test1:[1 3] -@test2:(2)"
INTERSECT {
  NUMERIC {1.000000 <= @test1 <= 3.000000}
  NOT{
    @test2:2
  }
}

127.0.0.1:6379> FT.EXPLAIN myIdx "-@test2:(2) @test1:[1 3]"
INTERSECT {
  <WILDCARD>}
  NOT{
    @test2:INTERSECT {
      @test2:2
      NUMERIC {1.000000 <= @test1 <= 3.000000}
    }
  }
}

Basically the second one is parsed as though it was -(@test2:2 @test1:[1 3]). Regardless, the result of that query should be doc5, doc4, doc3, doc1 and not the result we got. So there's actually a bug here besides the ambiguity bug.

@dvirsky
Copy link
Contributor

dvirsky commented Aug 28, 2017

@jrots as for the query parsing thing, I'm not sure this necessarily constitutes a bug, I just think the precedence should be documented better, after all you can express the query as "(-@test2:2) (@test1:[1 3])" and it will have the desired meaning. Anyway it's not the big issue here. There is however a problem with the not operator not working correctly, that is more urgent to fix. This is a very good find!

@jrots
Copy link
Author

jrots commented Aug 29, 2017

Just had a bit similar issue without using the negation operator -
Would think that both queries below should result in the same output which is not the case at the moment.
anyways putting normal term queries now at the end of range queries and negation queries at the very end to avoid weird parsing issues. :)

127.0.0.1:6379> 
FT.EXPLAIN testjrots "@bd:[19900101 19960930] @fmin:[-inf 33] @fmax:[33 inf] @ge:2"
100 results

INTERSECT {
	NUMERIC {19900101.000000 <= @bd <= 19960930.000000}
	NUMERIC {-inf <= @fmin <= 33.000000}
	NUMERIC {33.000000 <= @fmax <= inf}
	@ge:2
}

127.0.0.1:6379> 
FT.explain testjrots "@ge:2 @bd:[19900101 19960930] @fmin:[-inf 33] @fmax:[33 inf]"
0 results: 

@ge:INTERSECT {
	@ge:INTERSECT {
		@ge:INTERSECT {
			@ge:2
				NUMERIC {19900101.000000 <= @bd <= 19960930.000000}
		}
		NUMERIC {-inf <= @fmin <= 33.000000}
	}
	NUMERIC {33.000000 <= @fmax <= inf}
}

@dvirsky
Copy link
Contributor

dvirsky commented Aug 29, 2017

yeah, it shouldn't be hard to fix. It's just a matter of rule precedence in the parser. your intuition is right.

@jrots
Copy link
Author

jrots commented Aug 29, 2017

this also is smth weird:

127.0.0.1:6379> ft.search testjrots "@bd:[19900101 19960930] @fmin:[-inf 33] @fmax:[33 inf] @fgen:(3|1) @co:be" limit 0 1
(error) Syntax error at offset 71 near 'be'
127.0.0.1:6379> ft.search testjrots "@bd:[19900101 19960930] @fmin:[-inf 33] @fmax:[33 inf] @fgen:(3|1) @co:nl" limit 0 1
1) (integer) 0
127.0.0.1:6379> ft.search testjrots "@bd:[19900101 19960930] @fmin:[-inf 33] @fmax:[33 inf] @fgen:(3|1) @co:be -@uid:768802" limit 0 1
1) (integer) 1764
..

@dvirsky
Copy link
Contributor

dvirsky commented Aug 29, 2017

that's because "be" is a stopword.

@jrots
Copy link
Author

jrots commented Aug 29, 2017

alright got it :)

@dvirsky
Copy link
Contributor

dvirsky commented Aug 29, 2017

actually it shouldn't be a syntax error but rather quietly ignored. I'll add that to the parser fixes. You can disable stopwords completely if you want BTW.

@dvirsky
Copy link
Contributor

dvirsky commented Sep 28, 2017

The parser precedence issue still exists, but I've fixed the NOT iterator to avoid all the problems with its position in the query, and now both results are the same:

127.0.0.1:6379> FT.SEARCH myIdx "@test1:[1 3] -@test2:(2)"
1) (integer) 2
2) "doc3"
3) 1) "test1"
   2) "3"
   3) "test2"
   4) "3"
4) "doc1"
5) 1) "test1"
   2) "1"
   3) "test2"
   4) "1"
127.0.0.1:6379> FT.search myIdx "(-@test2:2) @test1:[1 3]"
1) (integer) 2
2) "doc3"
3) 1) "test1"
   2) "3"
   3) "test2"
   4) "3"
4) "doc1"
5) 1) "test1"
   2) "1"
   3) "test2"
   4) "1"

@dvirsky
Copy link
Contributor

dvirsky commented Sep 28, 2017

I'm keeping this issue open as a reminder for the parser issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants