Skip to content

Bugfix/non integer taxon keys#203

Closed
jhnwllr wants to merge 7 commits into
devfrom
bugfix/non-integer-taxon-keys
Closed

Bugfix/non integer taxon keys#203
jhnwllr wants to merge 7 commits into
devfrom
bugfix/non-integer-taxon-keys

Conversation

@jhnwllr

@jhnwllr jhnwllr commented Mar 25, 2026

Copy link
Copy Markdown
Contributor

Fix non-integer taxon keys in download request

Fixes #179

jhnwllr and others added 6 commits November 14, 2025 14:49
* Feature/grscicoll (#173)

Adding support for GRSciColl institution and collection search. Updating documentation and vcr fixtures.

* Adding support for species match v2 #176 (#177)

* preparing for new version 0.6.6

* updating ver

* fixing tests and fresh cassettes

* fixing build
- Add type conversion for integer keys (TAXON_KEY, YEAR, MONTH, etc.)
- Add type conversion for numeric keys (DEPTH, ELEVATION, coordinates, etc.)
- Modify _parse_args function to convert string values to int/float
- Handle both simple predicates and 'in' predicates with value lists
- Add comprehensive tests for integer and numeric conversions
- Update existing tests to expect numeric values
- Remove numeric_keys set and float conversion logic
- Only convert taxonomic backbone keys (TAXON_KEY, SPECIES_KEY, etc.) and YEAR/MONTH to integers
- UUID keys (DATASET_KEY, NETWORK_KEY, etc.) remain as strings
- Add tests for datasetKey to verify UUIDs are not converted
- Add test for mixed taxonomic and UUID keys
- 19 tests passing

This comment was marked as outdated.

- Add _convert_to_int() helper function to handle edge cases
- Handle float-like strings (e.g., '5785887.0') by converting via float first
- Raise clear ValueError for non-numeric values instead of silently failing
- Log warning when fractional parts are truncated
- Apply robust conversion to both scalar and 'in' predicates
- Add 3 new tests: float string conversion, in-predicate float conversion, invalid value error
- All 22 tests passing

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +26 to +30
# Taxonomic keys from GBIF's backbone taxonomy that should be integers
# Note: Other keys like DATASET_KEY, NETWORK_KEY are UUIDs and should remain strings
taxonomic_integer_keys = {
"TAXON_KEY", "SPECIES_KEY", "GENUS_KEY", "FAMILY_KEY",
"ORDER_KEY", "CLASS_KEY", "PHYLUM_KEY", "KINGDOM_KEY",

Copilot AI Mar 25, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments/variable name imply these are strictly taxonomic backbone keys, but the set also includes YEAR and MONTH. To avoid confusion for future maintenance, either rename this set to something like integer_predicate_keys (or similar) or adjust the comment to match its actual contents.

Suggested change
# Taxonomic keys from GBIF's backbone taxonomy that should be integers
# Note: Other keys like DATASET_KEY, NETWORK_KEY are UUIDs and should remain strings
taxonomic_integer_keys = {
"TAXON_KEY", "SPECIES_KEY", "GENUS_KEY", "FAMILY_KEY",
"ORDER_KEY", "CLASS_KEY", "PHYLUM_KEY", "KINGDOM_KEY",
# Includes taxonomic backbone keys (e.g. TAXON_KEY, SPECIES_KEY) and other
# integer-valued predicates such as YEAR and MONTH. Other keys like
# DATASET_KEY, NETWORK_KEY are UUIDs and should remain strings.
taxonomic_integer_keys = {
"TAXON_KEY", "SPECIES_KEY", "GENUS_KEY", "FAMILY_KEY",
"ORDER_KEY", "CLASS_KEY", "PHYLUM_KEY", "KINGDOM_KEY",

Copilot uses AI. Check for mistakes.
try:
converted_values.append(_convert_to_int(v, key))
except ValueError as e:
raise ValueError(f"Error in 'in' predicate for {key}: {str(e)}")

Copilot AI Mar 25, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When re-raising conversion failures in the 'in' predicate branch, the original exception context is dropped. Using exception chaining (raise ... from e) would preserve the underlying cause/traceback and make debugging malformed lists easier.

Suggested change
raise ValueError(f"Error in 'in' predicate for {key}: {str(e)}")
raise ValueError(f"Error in 'in' predicate for {key}: {str(e)}") from e

Copilot uses AI. Check for mistakes.
)
self.assertIsInstance(payload["predicate"]["predicates"][0]["value"], int)

def test_depth_numeric_conversion(self):

Copilot AI Mar 25, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test name "test_depth_numeric_conversion" conflicts with its docstring/behavior (it asserts depth stays a string and is not converted). Consider renaming the test to reflect the intent (e.g., that non-integer numeric-like fields remain strings) to keep the suite self-explanatory.

Suggested change
def test_depth_numeric_conversion(self):
def test_depth_numeric_like_value_remains_string(self):

Copilot uses AI. Check for mistakes.
@jhnwllr

jhnwllr commented Mar 25, 2026

Copy link
Copy Markdown
Contributor Author

I think we have to think about another approach to handle downloads with checklistKey parameter.

@jhnwllr jhnwllr closed this Mar 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants