InferencePackingConfig fails when input contains empty strings #315

@tannonk

Description

When using InferencePackingConfig, the model crashes if the input list contains an empty string ("").

Standard inference (without packing) handles empty strings gracefully by returning an empty list of entities, but the packing logic triggers a ValueError.

Steps to Reproduce

The following script demonstrates that the model works as expected during standard inference but fails as soon as packing is enabled.

from gliner import GLiNER, InferencePackingConfig

model = GLiNER.from_pretrained("urchade/gliner_medium-v2.1")  # or "knowledgator/gliner-decoder-base-v1.0", "knowledgator/gliner-bi-small-v1.0"

texts = ["Email CEO to approve budget", ""]
labels = ["person", "organization", "action"]

# 1. Standard inference works fine
print("Running standard inference...")
predictions = model.inference(texts, labels, batch_size=16)
print(f"Standard results: {predictions}") 
# Expected: [[{'start': 6, 'end': 9, 'text': 'CEO', 'label': 'person', ...}], []]

# 2. Inference with packing fails
print("\nRunning packed inference...")
packing_cfg = InferencePackingConfig(
    max_length=512,
    sep_token_id=model.data_processor.transformer_tokenizer.sep_token_id,
    streams_per_batch=1,
)

model.configure_inference_packing(packing_cfg)
predictions_packed = model.inference(texts, labels, batch_size=16)  # raises ValueError (see the traceback under Actual Behavior)

Expected Behavior

The packed inference should return an empty list for the empty string entry, matching the behaviour of standard inference:
[[{...}], []]

Actual Behavior

Traceback Snippet:

...
File "/home/user/.local/lib/python3.12/site-packages/flashdeberta/model.py", line 443, in forward
    raise ValueError("You have to specify either input_ids or inputs_embeds")
ValueError: You have to specify either input_ids or inputs_embeds

Models

I can confirm the problem affects several models, e.g. urchade/gliner_medium-v2.1, knowledgator/gliner-decoder-base-v1.0, and knowledgator/gliner-bi-small-v1.0.

However, knowledgator/gliner-x-base is not affected. Instead, empty strings lead to a different error, which I'll make another issue for.

Environment details


Potential Workaround

Users can filter out empty strings before passing them to the model, but this requires manual management of indices to re-align results later. An internal fix to unify the behaviour between standard and packed inference would be ideal.
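
For anyone who needs the workaround in the meantime, here is a minimal sketch of the filter-and-realign approach. The helper name `predict_skipping_empty` is hypothetical and not part of GLiNER. It wraps any callable that maps a list of texts to a list of per-text results, and it assumes that whitespace-only strings should be treated like empty ones, which may or may not match what you want.

```python
from typing import Callable, List, Sequence


def predict_skipping_empty(
    predict: Callable[[List[str]], List[list]],
    texts: Sequence[str],
) -> List[list]:
    """Run `predict` on non-empty texts only and re-align results to input order.

    Hypothetical helper, not part of GLiNER.
    """
    # Remember the original index of every text that has real content.
    # Assumption: whitespace-only strings are treated the same as "".
    kept = [(i, t) for i, t in enumerate(texts) if t.strip()]

    # Every position starts with an empty result, so skipped inputs
    # come back as [] just like in standard inference.
    results: List[list] = [[] for _ in texts]
    if not kept:
        return results

    outputs = predict([t for _, t in kept])
    if len(outputs) != len(kept):
        raise ValueError(
            f"expected {len(kept)} results from predict, got {len(outputs)}"
        )

    # Put each result back at the position of the text it belongs to.
    for (i, _), out in zip(kept, outputs):
        results[i] = out
    return results
```

With the reproduction script above, it could be called as `predict_skipping_empty(lambda batch: model.inference(batch, labels, batch_size=16), texts)`. The empty string never reaches the packed forward pass, and the second entry of the result is `[]`.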
