Welcome, dear bear friends, to the possible unveiling of @beartype's new built-in tensor typing API. It doesn't exist yet – but it could. And daydreaming is half the battle. This is all I learned from my childhood.
Currently, @beartype just farms out all responsibility for typing tensors to third-party packages like @patrick-kidger's wondrous jaxtyping. That's fine, of course. That works. Sure. No problem-o. We're all BFFLs here.
But third-party dependencies get awkward after a while. Because it's not simply jaxtyping, is it? It never ends at a single dependency, does it? It's a brutal gauntlet of third-party packages like pandera and deal and the twenty-thousand pound QA gorilla Pydantic and the list just goes on and on. All of these packages purport to work well with one another, but so infrequently do. There's always friction at the interface between two packages – let alone n packages whose intersection is usually the empty set. Which brings us to...
Pydantic: The Twenty-Thousand Pound QA Gorilla
LangGraph + PydanticAI now dominate the LLM space. We're probably all aware of that by now. LangChain? Old hat already. AutoGen? New hat but the hat is painful. CrewAI? New hat mostly only for rapid prototyping. Which means that Pydantic effectively dominates the Python space. And I ask myself:
"How did this wondrous magic come to be? How did Pydantic rise above its competitors to so thoroughly dominate the power law distribution for Python packages?"
There are many answers – but the simplest is just that Pydantic literally does everything. Pydantic users don't have to seek outside Pydantic. Want JSON? Pydantic's got that already. Schema inference? Pydantic. Type casting? Pydantic. Data ingestion? Pydantic. And so on ad nauseam.
Pydantic's a monolithic mecha-kaiju with batteries included. Pydantic is the Systemd of the Python world. It does everything you think it does and everything else you didn't think it could possibly do. How did this feature request become an advertisement for Pydantic!?!? 😮💨
@beartype will never be monolithic in the way that Pydantic is monolithic. @beartype generally prefers the UNIX philosophy of: "Do one thing and do that thing well." But UNIX philosophy is a gradated spectrum of possibility. There's no practical justification for @beartype to literally just do one thing and only one thing.
@beartype can do many things and still be @beartype. Which leads us to...
beartype.hint: A New Subpackage for a New Millennium
Gods! What a lame one-liner! Let's never use that slogan in anything users will see. 😂
beartype.hint will be a new public @beartype API. It's gonna be great! beartype.hint will define PEP-compliant and mypy-friendly type hint factories that are generically usable by any runtime type-checker – Pydantic, typeguard, or otherwise.
beartype.hint type hint factories will include the following (with a usage sketch just after this list):
- `beartype.hint.Tensor[...]`: a type hint factory for type-checking tensors defined by arbitrary third-party packages – including tensors defined by PyTorch, JAX, NumPy, SciPy, and so on. `beartype.hint.Tensor[...]` type hints are subscripted by tensor types, dtypes, ndims, and/or shapes. The syntax is the same old familiar Pythonic `typing` syntax – only extended to tensors:
  - `beartype.hint.Tensor[numpy.ndarray, int, typing.Literal[3]]`: a three-dimensional NumPy array of integers (of any size).
  - `beartype.hint.Tensor[torch.Tensor, torch.float, tuple[typing.Literal[2560], typing.Literal[1440]]]`: a two-dimensional PyTorch tensor of floats with the exact shape `2560 x 1440`.
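For instance, here's how those hints might look in an end user app. This is a hypothetical usage sketch: nothing under `beartype.hint` exists yet, the subscription syntax just mirrors the examples above, and the function names are made up for illustration.

```python
# Hypothetical usage of the proposed beartype.hint.Tensor[...] factory. This
# API does not exist yet; only the subscription syntax proposed above is shown.
from typing import Literal

import numpy
import torch
from beartype import beartype
from beartype.hint import Tensor  # <-- proposed API, not yet real

@beartype
def normalize_volume(
    # Any three-dimensional NumPy array of integers (of any size).
    volume: Tensor[numpy.ndarray, int, Literal[3]],
) -> Tensor[numpy.ndarray, float, Literal[3]]:
    return volume / volume.max()

@beartype
def render_frame(
    # A two-dimensional PyTorch tensor of floats of the exact shape 2560 x 1440.
    frame: Tensor[torch.Tensor, torch.float, tuple[Literal[2560], Literal[1440]]],
) -> None:
    ...
```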
The exact signature of beartype.hint.Tensor is a bit arduous to spec out, because Python doesn't even have the concept of a "type hint factory signature". Still, it looks something like a series of overloads subscripted by increasingly many child type hints (a rough implementation sketch follows the list below):
- The unsubscripted `beartype.hint.Tensor` attribute, matching any possible tensor from any third-party package. No idea if this is usable, but could be fun to support.
- `beartype.hint.Tensor[{tensor_type}]`, matching any tensor of the single third-party type `{tensor_type}`.
- `beartype.hint.Tensor[{tensor_type}, {tensor_dtype}]`, matching any tensor of the single third-party type `{tensor_type}` whose dtype is a subtype of `{tensor_dtype}`.
- `beartype.hint.Tensor[{tensor_type}, typing.Literal[{tensor_ndim}]]`, matching any tensor of the single third-party type `{tensor_type}` whose number of dimensions is exactly `{tensor_ndim}`.
- `beartype.hint.Tensor[{tensor_type}, tuple[typing.Literal[{tensor_dimension_1_size}], ..., typing.Literal[{tensor_dimension_N_size}]]]`, matching any tensor of the single third-party type `{tensor_type}` whose shape (i.e., the sizes of the `N` dimensions of this tensor) is exactly `{tensor_dimension_1_size}` through `{tensor_dimension_N_size}`.
- All possible permutations and combinations of the above. The only constraint is that the first child hint is always `{tensor_type}`.
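How might that factory actually work? Here's a minimal implementation sketch, assuming the factory's `__class_getitem__()` reduces each subscription to a `typing.Annotated[...]` hint built from existing `beartype.vale.Is` validators. The class and helper names are illustrative only, and the dtype check is simplified to equality rather than real subtype testing:

```python
# A minimal sketch of a Tensor type hint factory, reducing subscriptions to
# typing.Annotated[...] hints conjoining beartype.vale.Is validators. This is
# NOT the real beartype.hint.Tensor; it only illustrates the dispatch rules.
from typing import Annotated, Literal, get_args, get_origin

from beartype.vale import Is

def _is_dtype(dtype):
    # Validator matching tensors whose "dtype" attribute equals this dtype.
    return Is[lambda tensor: tensor.dtype == dtype]

def _is_ndim(ndim):
    # Validator matching tensors with exactly "ndim" dimensions.
    return Is[lambda tensor: tensor.ndim == ndim]

def _is_shape(shape):
    # Validator matching tensors whose shape is exactly "shape".
    return Is[lambda tensor: tuple(tensor.shape) == shape]

class Tensor:
    def __class_getitem__(cls, args):
        # Normalize single-argument subscriptions into a one-tuple.
        if not isinstance(args, tuple):
            args = (args,)

        # The first child hint is always the third-party tensor type.
        tensor_type, *constraints = args
        validators = []

        for constraint in constraints:
            if get_origin(constraint) is Literal:
                # typing.Literal[N] constrains the number of dimensions.
                validators.append(_is_ndim(get_args(constraint)[0]))
            elif get_origin(constraint) is tuple:
                # tuple[typing.Literal[...], ...] constrains the exact shape.
                shape = tuple(get_args(dim)[0] for dim in get_args(constraint))
                validators.append(_is_shape(shape))
            else:
                # Anything else constrains the dtype.
                validators.append(_is_dtype(constraint))

        # With no constraints, reduce to just the bare tensor type.
        if not validators:
            return tensor_type

        # Conjoin all validators into a single Annotated[...] hint.
        validator = validators[0]
        for extra in validators[1:]:
            validator &= extra
        return Annotated[tensor_type, validator]
```

Under this sketch, `Tensor[numpy.ndarray, int, typing.Literal[3]]` reduces to `Annotated[numpy.ndarray, Is[...] & Is[...]]`: the sort of PEP 593 `Annotated` hint that @beartype already enforces today and that other tools (mypy included) gracefully degrade to plain `numpy.ndarray`.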
Is implementing PEP-compliant and mypy-friendly type hint factories that are generically usable by any runtime type-checker even feasible, though? It's trivial. In fact, it's so trivial I already specced out a working solution over at #522. No problem-o. Even as I said that, though, my face was sweating. 😰 🥵
Since beartype.hint.Tensor[...] type hints are the most succinct description of tensors, beartype.hint.Tensor[...] type hints are what most users are likely to use as actual type hints in end user apps. Under the hood, though, @beartype will reduce beartype.hint.Tensor[...] type hints to equivalent...
@beartype Tensor Validators: A New Victor Emerges from the Rubble of the @beartype API
This feature request sure got long fast, didn't it? We're exhausted – and so are you. So, let's just finish up by exhibiting a few new beartype.vale validators unique to typing tensors. Users will be welcome to use these longer-winded public validators, even though nobody wants to (a quick usage sketch follows the list):
- `typing.Annotated[{tensor_type}, IsTensorDtype[{tensor_dtype}]]`, semantically equivalent to the more compact form `beartype.hint.Tensor[{tensor_type}, {tensor_dtype}]` outlined above.
- `typing.Annotated[{tensor_type}, IsTensorNdim[{tensor_ndim}]]`, semantically equivalent to the more compact form `beartype.hint.Tensor[{tensor_type}, typing.Literal[{tensor_ndim}]]` outlined above.
- `typing.Annotated[{tensor_type}, IsTensorShape[{tensor_shape}]]`, semantically equivalent to the honestly far more verbose form `beartype.hint.Tensor[{tensor_type}, tuple[typing.Literal[{tensor_dimension_1_size}], ..., typing.Literal[{tensor_dimension_N_size}]]]` outlined above. Python makes PEP-compliant type hints use `typing.Literal` for literally (...get it?) every magic number in a type hint. Interestingly, this ensures that the @beartype tensor validator approach beats out the `beartype.hint.Tensor[...]` approach in terms of readability. Whatevah!
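In practice, the longhand spelling might look something like this hypothetical sketch. The `IsTensorDtype` and `IsTensorShape` names are the ones proposed above and don't exist yet; everything else (`typing.Annotated`, `@beartype`) works today:

```python
# Hypothetical usage of the proposed beartype.vale tensor validators. The
# IsTensorDtype and IsTensorShape imports below are proposals, not yet real.
from typing import Annotated

import torch
from beartype import beartype
from beartype.vale import IsTensorDtype, IsTensorShape  # <-- proposed, not yet real

# Longhand validator form...
Frame2D = Annotated[
    torch.Tensor, IsTensorDtype[torch.float], IsTensorShape[(2560, 1440)]]

# ...semantically equivalent to the compact factory form:
#     beartype.hint.Tensor[
#         torch.Tensor, torch.float, tuple[typing.Literal[2560], typing.Literal[1440]]]

@beartype
def render_frame(frame: Frame2D) -> None:
    ...
```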
In Conclusion, I Have a Tired Face
That's it. That's @beartype tensor type hints. A similar approach can be extended to Pandas and Polars dataframes by defining a new beartype.hint.DataFrame[...] type hint factory backed by corresponding new beartype.vale validators.
Is any of this magic feasible? Absolutely. In fact, not only is this magic feasible, but this magic is trivially feasible. I should have done it years ago. So why didn't I?
Laziness. Actually, it's even worse than laziness. It's foolish ideology. I foolishly believed a bit too zealously in the UNIX philosophy. I wanted to help co-create a rich and diverse ecosystem of small little Python packages that each worked together to form a much larger and even more complete holonomy of vibrant software holons, all harmoniously working in concert for the good of all.
In other words, I was dumb. I should have just done what Pydantic did, which was to do everything and do everything well. Why farm essential type-checking work out to sibling third-party packages when @beartype could just do all of that work itself, right? Sure, it's more work for me – but it's a lot less work for you, the user. And you, the user, are most of what matters here.
beartype.hint.Tensor[...]: because users matter more than @leycec's sanity. 🥲