Currently, SQL databases have a well-defined LIKE operator with % (multi-character) and _ (single-character) wildcards, often extended with case-insensitive options and escape characters.
In NoSQL databases, the situation is inconsistent:
- Some have no equivalent to LIKE.
- Some use regex-based matching (often tied to the database’s programming language, e.g., Java regex in some APIs).
- Even when a LIKE-style operator exists, the syntax and semantics can vary.
As a Java/NoSQL developer, I would love to have a typical behavior for LIKE that's consistent with >, <, and so on.
Problem
Jakarta Query aims to provide a unified query language usable across SQL and NoSQL providers. Without a defined LIKE-style syntax:
- Developers must write different queries for different backends.
- Cross-store queries lose portability.
- Provider-specific behavior can cause unexpected differences.
Posgresql
SELECT * FROM customers WHERE name LIKE 'John%';
MongoDB (NoSQL) – Regex-based
db.customers.find({ name: { $regex: "^John", $options: "i" } });
Elasticsearch (NoSQL) – Wildcard query
{
"query": {
"wildcard": {
"name": "John*"
}
}
}
* matches zero or more characters.
? matches a single character.
Cassandra Limite supported
SELECT * FROM customers WHERE name LIKE 'John%';
- Only supports
% at the end of the pattern (prefix%).
- No
_ support.
⚠️ You can only use LIKE on columns that have a secondary index (or are indexed via SASI indexes in newer Cassandra versions). And some databases will not support this term.
Questions for Discussion:
-
1 Should Jakarta Query define a standardized LIKE syntax (e.g., % and _ wildcards) for cross-store compatibility?
- I would be fine to define a minimum behavior.
-
2 Should pattern translation be handled by:
- (a) The database vendor, or
- (b) The Jakarta Query/Jakarta Data provider (mapping
%/_ to vendor-specific syntax)?
-
3 Should the spec define matching rules explicitly for consistency?
Currently, SQL databases have a well-defined LIKE operator with % (multi-character) and _ (single-character) wildcards, often extended with case-insensitive options and escape characters.
In NoSQL databases, the situation is inconsistent:
As a Java/NoSQL developer, I would love to have a typical behavior for
LIKEthat's consistent with>,<, and so on.Problem
Jakarta Query aims to provide a unified query language usable across SQL and NoSQL providers. Without a defined
LIKE-stylesyntax:Posgresql
%matches zero or more characters._matches exactly one character.MongoDB (NoSQL) – Regex-based
Elasticsearch (NoSQL) – Wildcard query
{ "query": { "wildcard": { "name": "John*" } } }*matches zero or more characters.?matches a single character.Cassandra Limite supported
%at the end of the pattern (prefix%)._support.Questions for Discussion:
1 Should Jakarta Query define a standardized
LIKEsyntax (e.g.,%and_wildcards) for cross-store compatibility?2 Should pattern translation be handled by:
%/_to vendor-specific syntax)?3 Should the spec define matching rules explicitly for consistency?