Allow array denominator in quad_over_lin#3262
Conversation
|
Please use our PR template |
|
Can you explain why we wouldn't include broadcasting between x and y? What do you mean by reduce? |
|
Please install and run the pre commit |
|
My first thought in my review is that the tests seem way more complicated than for any comparable atoms. Can you compress them? |
|
I will look into broadcasting, by reduce I meant sum I thought I had set up pre-commit properly, everything is green locally, I'll resolve. I will simplify the tests to match other similar atoms. |
|
I implemented broadcasting, cut down on the tests and checked my setup of pre-commit (but didn't find any failures). |
| else: | ||
| axes = normalize_axis_tuple(axis, ndim) | ||
|
|
||
There was a problem hiding this comment.
I had thought this would be prohibited my the pre-commit... I need to debug why it isn't firing
There was a problem hiding this comment.
Claude flagged that as rule W293; I tried applying it but got a lot of changes so left it out and removed the whitespace
There was a problem hiding this comment.
Okay we just merged W293 in as part of #3264, thanks for ctching it wasn't enforced!
1af97c8 to
19079b6
Compare
|
|
||
| def lazy_broadcast_to(expr, shape): | ||
| """Broadcast expr to shape only if necessary.""" | ||
| return expr if expr.shape == shape else broadcast_to(expr, shape) |
There was a problem hiding this comment.
broadcast_to is a No-Op in many cases, I'm surprised this is doing anything. What inspired you to add it?
There was a problem hiding this comment.
I initially added it because I broke 2 tests* and thought it caused a benchmark regression. I just reran the affected benchmark a few times with/without and it actually looked like noise, so I removed it (and updated affected tests). I should mention that since this PR would always call broadcast_to in SOC and QP canonicalizers and it doesn't support cpp, these problems will be canonicalized with Scipy. Using lazy_broadcast_to prevented that (returned the variable itself, so didn't add a cpp blocker to quad_over_lin).
Another notable difference is that even when broadcast_to is a no-op, the QP canonicalizer adds an extra expression here, so e.g. users with array x, scalar y (the existing supported behavior) would see different problem_data. This is why I changed the tests which used problem data to adjust for the larger P. I don't know if it affects performance though.
| normalize_axis_tuple(axes, len(bshape)) | ||
| Atom.validate_arguments(self) | ||
|
|
||
| def shape_from_args(self) -> Tuple[int, ...]: |
There was a problem hiding this comment.
Shouldn't AxisAtom do all this for us?
There was a problem hiding this comment.
AxisAtom infers the shape from the first argument, but with broadcasting in quad_over_lin we need both x and y to determine the full broadcast shape, and apply the axis reductions to this shape. For example, if x.shape = (3,), y.shape = (5,4,3); broadcast_shape = (5,4,3) and axis=0 gives (4,3). We'd get () from AxisAtom (scalar result).
Edit: Maybe I misunderstood, did you mean that this behavior should be handled by AxisAtom? I think quad_over_lin is the only multi-argument subclass to AxisAtom
| """ | ||
| return self.args[0].is_pwl() and self.args[1].is_constant() | ||
|
|
||
| def _unbroadcast( |
There was a problem hiding this comment.
Is this the transpose of broadcasting?
There was a problem hiding this comment.
It sums on the extra axes between arr and the target shape, the unbroadcast name doesn't exactly convey this but it's used for lack of a better name. (And because it applies the logic of np.broadcast backwards to determine the axes to sum on)
Description
Closes #1197.
quad_over_lin(x, y)wherexandyare broadcast-compatible.x_ij^2 / y_ij, then reduces alongaxis.AxisAtomconvention:axis=None(default) sums all elements to a scalar.axis=()gives element-wise output with no reduction.axis=kreduces along axisk.Type of change
Contribution checklist