    def test_len(df, pdf):
        df2 = df[["x"]] + 1
        assert len(df2) == len(pdf)
        assert len(df[df.x > 5]) == len(pdf[pdf.x > 5])
        first = df2.partitions[0].compute()
>       assert len(df2.partitions[0]) == len(first)
E       AssertionError: assert 100 == 10
E        +  where 100 = len(Dask DataFrame Structure:\n x\nnpartitions=1 \n0 int64\n10 ...\nDask Name: partitions, 4 expressions\nExpr=Partitions(frame=df[['x']] + 1, partitions=[0]))
E        +  and   10 = len( x\n0 1\n1 2\n2 3\n3 4\n4 5\n5 6\n6 7\n7 8\n8 9\n9 10)
dask/dataframe/dask_expr/tests/test_collection.py:1718: AssertionError
FAILED dask/dataframe/dask_expr/tests/test_collection.py::test_len - AssertionError: assert 100 == 10
+ where 100 = len(Dask DataFrame Structure:\n x\nnpartitions=1 \n0 int64\n10 ...\nDask Name: partitions, 4 expressions\nExpr=Partitions(frame=df[['x']] + 1, partitions=[0]))
+ and 10 = len( x\n0 1\n1 2\n2 3\n3 4\n4 5\n5 6\n6 7\n7 8\n8 9\n9 10)
This test fails intermittently on both the pandas 2.3 and pandas 3.0 CI jobs:
https://github.com/dask/dask/actions/runs/21682277456/job/62519785526?pr=12271
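
Since the failure is intermittent, a loop like the one below might help estimate how often it happens locally. This is only a sketch: `count_failures` and `check_partition_len` are hypothetical helper names, and the 100-row, 10-partition frame is an assumption inferred from the output above (`100 == 10`), not a copy of the actual test fixture. If the flakiness depends on state shared between tests, a freshly built collection may not reproduce it.

```python
from typing import Callable


def count_failures(check: Callable[[], None], runs: int = 200) -> int:
    """Call `check` repeatedly and count how many calls raise AssertionError."""
    failures = 0
    for _ in range(runs):
        try:
            check()
        except AssertionError:
            failures += 1
    return failures


def check_partition_len() -> None:
    """Rebuild the collection and repeat the failing assertion from test_len.

    Assumption: the fixture is a 100-row frame split into 10 partitions.
    """
    # Imported lazily so count_failures stays usable without dask installed.
    import pandas as pd
    import dask.dataframe as dd

    pdf = pd.DataFrame({"x": range(100)})
    df = dd.from_pandas(pdf, npartitions=10)
    df2 = df[["x"]] + 1
    first = df2.partitions[0].compute()
    # Same comparison as the failing line in the traceback.
    assert len(df2.partitions[0]) == len(first)
```

Calling `count_failures(check_partition_len)` returns the number of failing runs out of 200; a result of zero would suggest the problem only shows up in the context of the full test session.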