Skip to content

Define LIKE-style pattern matching syntax for SQL and NoSQL in Jakarta Query #8

@otaviojava

Description

@otaviojava

Currently, SQL databases have a well-defined LIKE operator with % (multi-character) and _ (single-character) wildcards, often extended with case-insensitive options and escape characters.

In NoSQL databases, the situation is inconsistent:

  • Some have no equivalent to LIKE.
  • Some use regex-based matching (often tied to the database’s programming language, e.g., Java regex in some APIs).
  • Even when a LIKE-style operator exists, the syntax and semantics can vary.

As a Java/NoSQL developer, I would love to have a typical behavior for LIKE that's consistent with >, <, and so on.

Problem

Jakarta Query aims to provide a unified query language usable across SQL and NoSQL providers. Without a defined LIKE-style syntax:

  • Developers must write different queries for different backends.
  • Cross-store queries lose portability.
  • Provider-specific behavior can cause unexpected differences.

Posgresql

SELECT * FROM customers WHERE name LIKE 'John%';
  • % matches zero or more characters.

  • _ matches exactly one character.

MongoDB (NoSQL) – Regex-based

db.customers.find({ name: { $regex: "^John", $options: "i" } });

Elasticsearch (NoSQL) – Wildcard query

{
  "query": {
    "wildcard": {
      "name": "John*"
    }
  }
}
  • * matches zero or more characters.
  • ? matches a single character.

Cassandra Limite supported

SELECT * FROM customers WHERE name LIKE 'John%';
  • Only supports % at the end of the pattern (prefix%).
  • No _ support.

⚠️ You can only use LIKE on columns that have a secondary index (or are indexed via SASI indexes in newer Cassandra versions). And some databases will not support this term.

Questions for Discussion:

  • 1 Should Jakarta Query define a standardized LIKE syntax (e.g., % and _ wildcards) for cross-store compatibility?

    • I would be fine to define a minimum behavior.
  • 2 Should pattern translation be handled by:

    • (a) The database vendor, or
    • (b) The Jakarta Query/Jakarta Data provider (mapping %/_ to vendor-specific syntax)?
  • 3 Should the spec define matching rules explicitly for consistency?

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions