Infer item type of comprehensions by omkar-334 · Pull Request #7687 · cython/cython

omkar-334 · 2026-05-13T08:35:10Z

This PR fixes the issue #7684 where cython.typeof([(a, a+1) for a in x]) reports a bare list object even when the items have a known shape. and Adds tests for this in tests/run/type_inference_comprehensions.pyx.

ComprehensionNode.infer_type was a one-line stub that returned self.type. Now it looks at the body expression (self.append.expr for list/set, key_expr/value_expr for dict), infers each item's type, and calls specialize_here on the container. Falls back to the bare container whenever any item is unresolvable, so no path can regress.

After the change:

[(a, a + 1) for a in x]      # list[tuple object] object   (was: list object)
[42 for _ in range(3)]        # list[long] object
{str(i): 42 for i in range(2)}  # dict[str object,long] object

omkar-334 · 2026-05-13T08:36:08Z

Testing script:

#!/usr/bin/env bash
# Demo: comprehension item-type inference. Run from the repo root.

T=$(mktemp -d)

cat > $T/issue_example.pyx <<'EOF'
# cython: language_level=3, infer_types=True
from cython cimport typeof

def issue_example():
    cdef list[int] x = [1, 2, 3, 4]
    result = [(a, a + 1) for a in x]
    return result, typeof(result)
EOF

PYTHONPATH=$(pwd) /tmp/cy313venv/bin/python \
    -m Cython.Build.Cythonize -i $T/issue_example.pyx >/dev/null 2>&1

/tmp/cy313venv/bin/python - <<PYEOF
import sys
sys.path.insert(0, "$T")
from issue_example import issue_example
result, type_str = issue_example()
print(result)
print(type_str)
PYEOF

Output

…fer-type

omkar-334 · 2026-05-13T08:41:13Z

Do we need to add more tests for this? I removed a few from my original commit because i thought they were not needed.

omkar-334 · 2026-05-13T08:48:41Z

Also, to achieve the best case (getting list[tuple[int, int]]) -do I need to change the infer_type method of TupleNode, ListNode, SetNode and DictNode?

FOr ListNode - Cython/Compiler/ExprNodes.py:9251: - There is already a TODO: Infer non-object list arrays.

You need to change .infer_type() for the comprehension nodes in ExprNodes.py to infer the type of the appended items.

I think you meant the same thing with this right

scoder · 2026-05-13T09:33:12Z

do I need to change the infer_type method of TupleNode, ListNode, SetNode and DictNode?

That's already on its way in #7686

scoder · 2026-05-13T09:36:39Z

+        # DictComprehensionAppendNode is defined later in this module, so
+        # dispatch by attribute shape to avoid a forward reference.


Forward references inside of the same module are just fine (inside of methods), but here, I think we should rather use recursion and ask the append node to infer its type.

So in DictComprehensionAppendNode and ComprehensionAppendNode I can add the infer_type(env) method and then ComprehensionNode.infer_type can just collect these right?

omkar-334 · 2026-05-13T10:05:46Z

so with these changes im getting this

The last case - it ends at tuple, even though the tuple contains further items inside it
There seems to be some issue with PyTupleObject and CTupleType

omkar-334 · 2026-05-13T10:09:29Z

For [(a, a + 1) for a in xs] with xs: cdef list[int], typeof printslist[(int, long)] object. But the list actually stores Python tuples at runtime, not C tuples (comprehensions always box items). The displayed type is misleading.

  r = [(a, a + 1) for a in xs]
  print(typeof(r))      # list[(int, long)] object   <- claims CTuple items
  print(type(r[0]))     # <class 'tuple'>            <- actual: Python tuple

scoder · 2026-05-13T13:27:05Z

There seems to be some issue with PyTupleObject and CTupleType

Tuple items are not type-inferred yet, pretty much for that reason.

scoder · 2026-05-13T13:37:44Z

the list actually stores Python tuples at runtime, not C tuples

It's more concrete to infer the underlying C types rather than the Python types. A ctuple seems as good as it gets for that. Note that the item types only become relevant when requesting the items (and potentially unpacking them). They don't need to represent the actual storage used inside of the list/set/dict.

It'll be a tradeoff in the end, though. If we automatically unpack the tuple values into ctuples that are then passed back into some Python API as Python tuples, we end up wasting time and space converting back and forth for no good reason. But if the user code then unpacks the items into C variables and processes them as C values further on, we can have big wins. Overall, I would expect the latter case to be more relevant, so let's keep the ctuples until we find a problem with it.

omkar-334 · 2026-05-18T07:15:41Z

@scoder Agreed, keeping the ctuples. The implementation already does that. [(a, a+1) for a in x] infers list[(int, long)]... what do you think about the current implementation?

omkar-334 · 2026-05-18T09:35:25Z

@scoder there were 10 tests failing , so i added a fix related to sliced/nested objects,.... and now 5 are failing..

omkar-334 · 2026-05-20T13:04:10Z

@scoder any suggestions?

# Conflicts: # Cython/Compiler/ExprNodes.py

da-woods · 2026-06-06T19:51:53Z

and now 5 are failing..

Looking at this in hindsight, I think most of the failures were problems we were having with the master branch at the time. So hopefully it's better now.... We shall see.

omkar-334 · 2026-06-07T06:54:46Z

thanks @da-woods , looks like all tests have passed

scoder · 2026-06-11T04:21:10Z

+        # Bail early for container types that cannot be parameterised.
+        if not self.type.supports_container_type:
+            return self.type
+        # Resolve names in the comprehension's own scope. NameNode.infer_type
+        # caches its entry, so using the outer env here would steal the inner
+        # loop variable's binding (regression: list_comp_in_closure_T598).
+        body_env = self.expr_scope or env
+        # Propagate the iterator's item type into the loop target before
+        # inspecting the body. Without this, `[a for a in xs]` where
+        # xs: list[int] sees `a` as py_object because ControlFlowAnalysis
+        # has not yet typed the comprehension's loop variable.
+        self._propagate_loop_target_type(env)


It's difficult to spot the code between the long comments.

Suggested change

# Bail early for container types that cannot be parameterised.

if not self.type.supports_container_type:

return self.type

# Resolve names in the comprehension's own scope. NameNode.infer_type

# caches its entry, so using the outer env here would steal the inner

# loop variable's binding (regression: list_comp_in_closure_T598).

body_env = self.expr_scope or env

# Propagate the iterator's item type into the loop target before

# inspecting the body. Without this, `[a for a in xs]` where

# xs: list[int] sees `a` as py_object because ControlFlowAnalysis

# has not yet typed the comprehension's loop variable.

self._propagate_loop_target_type(env)

if not self.type.supports_container_type:

return self.type

body_env = self.expr_scope or env

I've removed the target type assignment. See my other comment.

scoder · 2026-06-11T04:34:43Z

+        item_types = []
+        for item_type in self.append.infer_item_types(body_env):
+            if (item_type is None
+                    or item_type is py_object_type
+                    or item_type.is_error
+                    or item_type.is_unspecified
+                    or item_type.is_fused):
+                return self.type
+            item_types.append(item_type)


I don't think unspecified types should ever originate from a type inference method, so we shouldn't handle them specially here. I also don't think the error type needs to be handled specially. It probably won't hurt to specialise the comprehension type with it.

Why are fused types an issue here?

If possible, I'd just do the following:

Suggested change

item_types = []

for item_type in self.append.infer_item_types(body_env):

if (item_type is None

or item_type is py_object_type

or item_type.is_error

or item_type.is_unspecified

or item_type.is_fused):

return self.type

item_types.append(item_type)

item_types = self.append.infer_item_types(body_env)

scoder · 2026-06-11T04:44:49Z

+        if (target is None or not getattr(target, 'is_name', False)
+                or target.entry is None):


It's a node, there should be an .is_name attribute.

Suggested change

if (target is None or not getattr(target, 'is_name', False)

or target.entry is None):

if target is None or not target.is_name or target.entry is None:

scoder · 2026-06-11T04:45:47Z

+        item_type = sequence_type.infer_iterator_type()
+        if (item_type is None
+                or item_type is py_object_type
+                or item_type.is_unspecified):
+            return


Suggested change

item_type = sequence_type.infer_iterator_type()

if (item_type is None

or item_type is py_object_type

or item_type.is_unspecified):

return

item_type = sequence_type.infer_iterator_type()

if item_type is None or item_type is py_object_type:

return

scoder · 2026-06-11T04:50:26Z

+        # Propagate the iterator's item type into the loop target before
+        # inspecting the body. Without this, `[a for a in xs]` where
+        # xs: list[int] sees `a` as py_object because ControlFlowAnalysis
+        # has not yet typed the comprehension's loop variable.
+        self._propagate_loop_target_type(env)


Hmmm. Looking at this now, I think this is the wrong place to assign a type because it drops a side-effect into .infer_type(). That type analysis step shouldn't modify the tree, it should really only analyse it. I think this part needs to go. I've removed the method call from the reordering above. We can do it in .analyse_types() later, maybe even rather in the for-loop node than here.

Although … why is this an issue at all? The item values that we collect in the comprehension are necessarily Python objects. Does it matter whether they are typed as anything but py_object_type ?

omkar-334 added 3 commits May 13, 2026 13:02

Infer item type of comprehensions

73ed4bb

add tests

2a3403e

Merge branch 'master' into infer-type

a30135f

omkar-334 added 2 commits May 13, 2026 14:08

simplify tests

c1df80f

Merge branch 'infer-type' of github-personal:omkar-334/cython into in…

5197949

…fer-type

scoder reviewed May 13, 2026

View reviewed changes

Infer item type of list/set/dict comprehensions

01bea83

Merge branch 'master' into infer-type

479b655

fix slicing and nested indexing of inferred parameterised containers

0e8577f

Merge branch 'master' of github.com:cython/cython into infer-type

7e39d7f

# Conflicts: # Cython/Compiler/ExprNodes.py

scoder reviewed Jun 11, 2026

View reviewed changes

		# DictComprehensionAppendNode is defined later in this module, so
		# dispatch by attribute shape to avoid a forward reference.

		if (target is None or not getattr(target, 'is_name', False)
		or target.entry is None):

	if (target is None or not getattr(target, 'is_name', False)
	or target.entry is None):
	if target is None or not target.is_name or target.entry is None:

Uh oh!

Conversation

omkar-334 commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

omkar-334 commented May 13, 2026

Uh oh!

omkar-334 commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

omkar-334 commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

scoder commented May 13, 2026

Uh oh!

scoder May 13, 2026

Choose a reason for hiding this comment

Uh oh!

omkar-334 May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

omkar-334 commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

omkar-334 commented May 13, 2026

Uh oh!

scoder commented May 13, 2026

Uh oh!

scoder commented May 13, 2026

Uh oh!

omkar-334 commented May 18, 2026

Uh oh!

omkar-334 commented May 18, 2026

Uh oh!

omkar-334 commented May 20, 2026

Uh oh!

da-woods commented Jun 6, 2026

Uh oh!

omkar-334 commented Jun 7, 2026

Uh oh!

scoder Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

scoder Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

scoder Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

scoder Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

scoder Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

omkar-334 commented May 13, 2026 •

edited

Loading

omkar-334 commented May 13, 2026 •

edited

Loading

omkar-334 commented May 13, 2026 •

edited

Loading

omkar-334 May 13, 2026 •

edited

Loading

omkar-334 commented May 13, 2026 •

edited

Loading