Residual Calculation Fix and Miscellaneous Improvements #45
pberkes merged 10 commits into pberkes:master
Conversation
In testing, only 1 in 100 test runs failed, an improvement over the previous behavior.
|
Hi @pberkes ! Apologies, I realize this is quite a lengthy PR. This PR fixes a somewhat subtle bug, but one that still has a significant impact on the classification accuracy, so I wanted to make sure to include a mathematical breakdown. This PR is similar to PR #44 in that both fix issues with correctly accounting for the logarithmic transform used (only) by the Polynomial and Exponential classes. Unlike PR #44, however, this PR actually corrects the behavior of how big_o decides the classification of a given function. |
|
Update on this? |
|
I'm sorry, @TheMatt2 , I've been traveling, I'll have a look this week |
|
Hi @TheMatt2 , thanks for the PR! Good catch with the residuals, and the Python optimization problem is fun. My main concern is that if I set the new class attribute so that the residuals are not recalculated, the tests still pass. As for design, I'm not a big fan of having that behavior controlled by a class attribute. The solutions I propose are 1) always recalculate the residuals; there's some overhead, but it's the cleanest; or 2) define a separate method for the residual calculation. I'll add other minor comments inline |
|
Hi @pberkes , Apologies for lumping multiple issues into this one PR; in hindsight, I should have used branches to track the changes separately. Now I know for next time.
To denote that the correct behavior is only going to be one of these two behaviors, I thought a class-level boolean value would be appropriate. Obviously, always recalculating the residuals solves this problem, but it seems a waste to always redo the calculation when the majority of complexity classes can just use the value returned by np.linalg.lstsq(). I think keeping it as a class variable would be best, though I accept your recommendation on what to name it. It might be worth acknowledging that there is another option: explicitly override fit() in the affected classes, for example:
class Polynomial(ComplexityClass):
    def fit(self, n, t):
        """ Fit complexity class parameters to timing data.

        Input:
        ------
        n -- Array of values of N for which execution time has been measured.
        t -- Array of execution times for each N in seconds.

        Output:
        -------
        residuals -- Sum of square errors of fit
        """
        # Fit the coefficients as usual, but recalculate the residuals
        # explicitly: the time transform means the residuals returned by
        # lstsq aren't comparable across classes.
        super().fit(n, t)
        # Alternatively: ComplexityClass.fit(self, n, t)
        ref_t = self.compute(n)
        residuals = np.sum((ref_t - t) ** 2)
        return residuals
|
|
As for the test cases passing even if the residuals are not recalculated correctly: |
|
I am unable to replicate the tests passing when the residuals are calculated incorrectly. When I run the test set, I (correctly) get a failure message. This failure is expected if the residual calculation was not performed correctly. Note that this test calls the residual calculation directly. |
|
I'm ok with renaming it. |
|
I see you are correct. Good catch. Apologies, I had thought I had tested that, but I guess I accidentally tested the Exponential case twice. For the test case, the answer for Polynomial comes out as:
>>> class_
<class 'big_o.complexities.Polynomial'>
>>> np.sum((y - ref_y) ** 2)
5.0832678484432925e-18
>>> residuals
1.7620118668727588e-28
|
Fixes an issue where the existing tests did not detect incorrect residuals in some cases; the problem was that the residuals involved were too small for the tests to catch.
Makes linear test use a larger range. This seems to decrease the chance of a misclassification on a system with noisy timing.
Previously this test was skipped.
TheMatt2 left a comment:
I think that covers all requested changes.
Let me know if there is anything else you would like me to change.
|
LGTM, thank you! |
|
Thanks everybody for providing this fix! I was going down the same rabbit hole as @TheMatt2 and wondering why it would not classify my linear values as a linear complexity class, but as polynomial instead. The culprit is the residual calculation fixed here. This fix was never released in a current big-O pip package, so everybody might still be running into this problem (just like I am). Is there a timeline for when the next release is due, @pberkes? (I can see there is #46.) |
Fixes an issue with how residuals are calculated and improves the reliability of test cases.
Residual Calculation Fix
tl;dr
The residual value returned from np.linalg.lstsq(x, y, rcond=-1) for the Polynomial and Exponential classes cannot be meaningfully compared with the residuals of the other complexity classes.
big_O/big_o/complexities.py
Line 36 in 0ed1e3d
The issue is that the transform used in the Polynomial and Exponential complexity classes to correctly generate the coefficients invalidates the residual calculation.
big_O/big_o/complexities.py
Lines 189 to 190 in 0ed1e3d
The solution is to recalculate the residuals manually:
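Conceptually, the recalculation looks like the sketch below. This is a minimal illustration, not the exact code in this PR; the helper name fit_with_true_residuals is hypothetical. It fits in the transformed space as usual, then recomputes the sum of squared errors against the untransformed timings.

```python
import numpy as np

# Hypothetical helper (not the exact code in this PR): fit a complexity
# class, then recompute the residuals in the original (untransformed)
# time space so they are comparable across all complexity classes.
def fit_with_true_residuals(complexity, n, t):
    n = np.asanyarray(n, dtype=float)
    t = np.asanyarray(t, dtype=float)
    complexity.fit(n, t)              # fits coefficients (possibly in log space)
    t_pred = complexity.compute(n)    # predictions in the original time space
    return np.sum((t_pred - t) ** 2)  # sum of squared errors
```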
Steps to Reproduce
Here is an example that demonstrates big_O misclassifying a quadratic function because of this issue.
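The original snippet is not reproduced here, but a sketch along the same lines looks like the following. The constants are illustrative and differ from the original report, so the exact residuals will differ from the numbers quoted below.

```python
import numpy as np
import big_o

# Synthetic "timings" from a quadratic function (illustrative constants).
n = np.linspace(100, 10000, 50)
t = 3e-7 * n**2 + 1e-4 * n + 0.01

# On a version of big_O without this fix, Polynomial and Exponential may
# report misleadingly small residuals and beat Quadratic.
best, fitted = big_o.infer_big_o_class(n, t)
for complexity_class, residual in fitted.items():
    print(complexity_class, residual)
print('Best:', best)
```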
As we can see in the calculated residuals, the best-matching classes come out as Polynomial with a residual of 1.32, followed by Exponential with a residual of 6.14, and followed distantly by Quadratic with a residual of 89.70. This is surprising, since the original function is quadratic, yet Polynomial and Exponential were found to be better matches.
If we look at the graph, we can see this is an issue.
Blue is the original function.
Green is the Quadratic complexity, which seems to match the original input quite well.
Orange is the Polynomial complexity, which seems to trail off of the original function at the tail end.
Red is Exponential complexity, which does not seem to match the original function very well at all, yet had a lower residual than Quadratic.
This is an issue: big_o.infer_big_o_class should have identified Quadratic as the best complexity.
The Math
For each complexity class, there are transform functions $F(n_i)$ and $T(t_i)$ that transform $n_i$ and $t_i$ before they are passed to np.linalg.lstsq(). The outputs of np.linalg.lstsq() that are used are coeff, referenced below as $c$, and residuals, referenced below as $r$; the other outputs of np.linalg.lstsq() are unused.
The residual $r$ returned by np.linalg.lstsq() is the sum of squared differences between the transformed timings and the fitted values, computed entirely in the transformed space. For a given $F(n_i)$ and $T(t_i)$ with the resulting coefficients $c$:

$$r = \sum_i \left( T(t_i) - F(n_i) \cdot c \right)^2$$

However, the error, i.e. the distance between the curve represented by the complexity class and the input data, is calculated by applying $T^{-1}$ to the fitted values $t_i' = F(n_i) \cdot c$ (mapping them back to the original time space) instead of applying $T$ to $t_i$:

$$\mathrm{error} = \sum_i \left( t_i - T^{-1}\left( F(n_i) \cdot c \right) \right)^2$$
For most of the complexity classes, where $T(t) = t$, this error calculation is equal to the residual calculation. But the two are not equal for Polynomial and Exponential, as shown in the table below:

| $T(t)$ | Complexity classes | Residual vs. error |
| --- | --- | --- |
| $t$ (no transform) | all other classes | equal |
| $\log(t)$ | Polynomial, Exponential | not equal |
So for Polynomial and Exponential, the residual must be recalculated to be equal to the error.
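To make the discrepancy concrete, here is a small self-contained sketch (synthetic data with made-up constants) that mimics the Polynomial-style transform $F(n) = [1, \log n]$, $T(t) = \log t$: the residual reported by lstsq lives in log space and is not comparable with the error recomputed in the original time space.

```python
import numpy as np

# Synthetic quadratic "timings" (made-up constants, for illustration only)
n = np.linspace(100, 10000, 50)
t = 3e-7 * n**2 + 1e-4 * n

# Polynomial-style fit: F(n) = [1, log n], T(t) = log t
x = np.vstack([np.ones_like(n), np.log(n)]).T
y = np.log(t)
coef, residual, _, _ = np.linalg.lstsq(x, y, rcond=-1)

# Error recomputed in the original (untransformed) time space: T^{-1}(F(n) . c)
t_pred = np.exp(x @ coef)
error = np.sum((t_pred - t) ** 2)

print('lstsq residual (log space):', residual[0])
print('recomputed error (time space):', error)
```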
Test Case Improvements
From testing, it seems that the test big_o.test.test_big_o.TestBigO in particular fails often. In testing, I found this test passed in only 29 out of 100 runs.
Specs below:
The problem appears to be that the quadratic function being used for testing is too fast for accurate timing to be achieved.
The code being used is:
big_O/big_o/test/test_big_o.py
Lines 50 to 54 in 0ed1e3d
The problem with this function is that Python optimizes out the multiplication, leaving only two empty loops.
This can be demonstrated by using dis to look at the disassembly:
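The original disassembly is not reproduced here, but the effect can be seen with a hypothetical stand-in whose loop body is a constant expression (the actual test function lives in big_O/big_o/test/test_big_o.py and may differ):

```python
import dis

# Hypothetical stand-in: the loop body is a constant expression, so the
# compiler folds it and discards the unused result.
def quadratic_stub(n):
    for i in range(n):
        for j in range(n):
            2 * 2

dis.dis(quadratic_stub)
# The output contains no multiply instruction (BINARY_MULTIPLY, or BINARY_OP
# on newer Pythons) for the body: the constant expression has been folded
# away, leaving essentially only the loop setup and iteration opcodes.
```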
Python has optimized the function down to just the two for loops, which run fast enough that the classification becomes inconsistent.
This is fixed by saving the results to variables and doing some simple addition. It also adds a constant component to the function being timed.
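A hypothetical sketch of the shape of that fix (the actual test code differs): keep the results live in variables so the work cannot be discarded, and include a small constant component.

```python
# Hypothetical sketch: the multiply result is stored and accumulated,
# so the interpreter has to perform the work on every iteration.
def quadratic_work(n):
    total = 0
    for i in range(n):
        for j in range(n):
            total += i * j
    return total + 1  # small constant component
```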
Other Improvements
Simplicity Bias
Add simplicity_bias as a parameter of big_o.infer_big_o_class() to allow the user to change the bias toward choosing a simpler complexity class when the residuals are close.
big_O/big_o/big_o.py
Line 102 in 0ed1e3d
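A usage sketch (the bias value passed here is illustrative, not the default):

```python
import numpy as np
import big_o

n = np.linspace(100, 10000, 50)
t = 2e-6 * n + 0.01  # synthetic, roughly linear "timings"

# A larger simplicity_bias makes infer_big_o_class more willing to choose
# a simpler class when the residuals are nearly tied.
best, fitted = big_o.infer_big_o_class(n, t, simplicity_bias=1e-4)
print(best)
```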
big_o.test.test_big_o.TestBigO.test_big_o(): change the Numpy sort from mergesort to heapsort
As reflected in the comments, mergesort can appear like a more linear function depending on the input set.
big_O/big_o/test/test_big_o.py
Lines 56 to 60 in 0ed1e3d
The Numpy documentation suggests a possible reason for this: the merge sort may actually be implemented as a radix sort or Timsort. Radix sort is a linear-time sorting algorithm, and Timsort, under the right conditions, can appear to be a linear sorting algorithm.
https://numpy.org/doc/stable/reference/generated/numpy.sort.html
This commit changes the sort to a heap sort, which is reliable enough that it is not necessary to set the random order of the list that is sorted.
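A sketch of the kind of workload this describes (the actual test code may differ):

```python
import numpy as np

# kind='heapsort' gives a consistent O(n log n) profile, whereas
# kind='mergesort' may dispatch to radix sort or Timsort and look
# closer to linear on some inputs.
def sort_workload(n):
    arr = np.random.rand(n)
    return np.sort(arr, kind='heapsort')
```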
big_o.test.test_big_o.TestBigO.test_measure_execution_time(): set n_timings to 5 for more reliable testing
big_O/big_o/test/test_big_o.py
Lines 17 to 20 in 0ed1e3d
It appears that the test is more reliable if the timing is run multiple times, which is done by setting n_timings=5.
Allow list parameters to be passed to complexity.fit()
Currently, ComplexityClass.fit() requires two Numpy arrays to be passed as arguments. However, that often means the calling code needs to explicitly import Numpy just to create the arrays.
This PR adds a call to np.asanyarray() on the inputs in complexity.fit(), so calling code can specify inputs as normal Python iterables. Test cases for this are included.
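A usage sketch of what this enables (the Linear class is from big_o.complexities; the data values are made up):

```python
from big_o import complexities

# Plain Python lists: no explicit numpy import is needed by the caller,
# because fit() now converts its inputs with np.asanyarray().
linear = complexities.Linear()
residuals = linear.fit([100, 200, 300, 400], [0.1, 0.2, 0.3, 0.4])
print(linear)
print('residuals:', residuals)
```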