Add predicted_zero_crossing function generalized for Datavectors #4004

MarloMo · 2022-05-30T03:37:46Z

Proposed changes

The Predicted Zero Crossing function fits a linear function to a set of y_values passed through as a set of DataVector 's at different times and uses the fit to predict what times the value zero will be crossed. This is a step toward measuring how long the surface will take to hit the excision surface so that the control system can adjust the grid to avoid the excision surface from coming out of the apparent horizon surface. The function now takes a DataVector for the x_values and a std::vector<DataVector> for the y_values.

Upgrade instructions

Code review checklist

The code is documented and the documentation renders correctly. Run
make doc to generate the documentation locally into BUILD_DIR/docs/html.
Then open index.html.
The code follows the stylistic and code quality guidelines listed in the
code review guide.
The PR lists upgrade instructions and is labeled bugfix or
new feature if appropriate.

Further comments

knelli2 · 2022-06-03T19:39:01Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.hpp

+ * \brief Predicts the zero crossing of multiple functions contained in a vector
+ * of datavectors.


CamelCase DataVector. Same in the rest of the comments in this commit.

knelli2 · 2022-06-03T19:53:15Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.hpp

+ * Fits a linear function to a set of datavectors for different x_values
+ * and uses the fits to predict what x_values, when the y_values are zero, will
+ * be crossed for each function contained in the datavector.


I think this comment is a bit cumbersome and misleading. From what it looks like, you aren't fitting one linear function. You're fitting N linear functions where N is the size of y_values. And each of these y_values uses the same x_values for the fit. Then you use each fit to determine the zero crossing corresponding to that y_value. Am I understanding that correctly?

Yes, that's right @knelli2

knelli2 · 2022-06-03T19:54:41Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.cpp

+  for (size_t i = 0; i < y_values.size(); i++) {
+    const auto coefficients = predictor.fit_coefficients(x_values, y_values);


I think this will calculate the fit coefficients for all y_values at once. No need to recalculate them every iteration in the loop.

knelli2 · 2022-06-03T19:56:28Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.hpp

+ */
+DataVector predicted_zero_crossing_value(


Question: Why did you choose DataVector as the return type? Would std::vector<double> make more sense so that it corresponds to the y_values input? Or does this need to be a DataVector for something else?

I chose a DataVector for no special reason besides just generalizing the function to return a DataVector type to store the zero crossing values. I think std::vector will achieve the same result. Would a vector be a more efficient return type? @knelli2

knelli2 · 2022-06-03T20:05:31Z

tests/Unit/NumericalAlgorithms/Interpolation/Test_PredictedZeroCrossing.cpp

+    y_values.push_back(DataVector{slope[i] * x_values[0] + b[i] + error,
+                                  slope[i] * x_values[1] + b[i] + error,
+                                  slope[i] * x_values[2] + b[i] + error});


Is there a reason why each component has to have the same error? Or can each component have a different error?

No reason besides just adding a small perturbation for each calculation, I think different error values would be better, I can change that if that's preferred. @knelli2

geoffrey4444 · 2022-06-09T23:36:01Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.cpp

  return -coefficients[0]/coefficients[1];
 }

+std::vector<double> predicted_zero_crossing_value(


Not a change request necessarily, but I'm wondering why you return std::vector<double> instead of DataVector?

geoffrey4444 · 2022-06-09T23:37:12Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.hpp

+ * same x_values for the fit. Each fit determines the zero-crossing
+ * corresponding to that y_value.
+ */
+std::vector<double> predicted_zero_crossing_value(


No necessarily a change request, but are we sure we wan this to return std::vector<double> and not DataVector?

geoffrey4444 · 2022-06-09T23:40:32Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.hpp

+ * \brief Predicts the zero crossing of multiple functions contained in a vector
+ * of datavectors.
+ *
+ * Fits a set of N linear functions where N is the size of y_values that use the


comma after "functions"

Isn't N the size of the DataVector, not the size of y_values? The size of x_values and y_values are the size of the std::vectors<>, which are the number of points used in each fit. N is the number of fits, which is the DataVector size, right?

knelli2 · 2022-06-16T16:16:45Z

@geoffrey4444 I think we are going back and forth on the return type of this function. I'm not sure which is best. Since @markscheel will be the one using this in size control, maybe he can let us know which return type is preferred, if any.

markscheel · 2022-06-17T17:55:05Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.cpp

 }

+std::vector<double> predicted_zero_crossing_value(
+    const DataVector& x_values, const std::vector<DataVector>& y_values) {


I am thinking of the primary use case for this function:
x_values will be a set of times. y_values will be a set of DataVectors, one DataVector for each time, where each DataVector is the characteristic speed evaluated at all points on the surface.
Then the result will be the zero-crossing time for each point on the surface.

So I think x_values should be a std::vector<double>.
Then there could be a check that x_values.size()==y_values.size().

And then I think the return type should be DataVector (with the same size as the DataVector in y_values).

I'm using @gsb76 's LinearLeastSquares function that takes LinearLeastSquares<Order>::fit_coefficients(const T& x_values, const std::vector<T>& y_values) for the arguments to do the fitting for each std::vector<DataVector corresponding to the y_values. Would it be okay to leave as is and just add the check for x_values.size()==y_values.size() and generalize the return type to be just DataVector ? @markscheel

I don't think the current implementation of this function will handle the use case I am thinking of.
This is for two reasons: One is that we don't want the user to be copying things into different structures in order to call the function. But the second reason is because I think you can't just call LinearLeastSquares in the same way you do now. Here's why:

The user (in the use cases I am thinking of) will have a std::vector<double> or maybe even a std::deque<double> for the x_values. This will represent a set of times.

The user will also have a std::vector<DataVector> or maybe a std::deque<DataVector> for the y_values, where each DataVector is the characteristic speed at all the points on a surface. The y_values are indexed by yvalues[x_value_index][datavector_index]. This is the opposite indexing order compared to what LinearLeastSquares expects for its y_values.

markscheel · 2022-06-17T17:55:40Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.cpp

  return -coefficients[0]/coefficients[1];
 }

+std::vector<double> predicted_zero_crossing_value(


Above in the case for doubles, add a check that x_values and y_values have the same size?

markscheel · 2022-06-21T23:28:33Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.cpp

 }

+std::vector<double> predicted_zero_crossing_value(
+    const DataVector& x_values, const std::vector<DataVector>& y_values) {


I don't think the current implementation of this function will handle the use case I am thinking of.
This is for two reasons: One is that we don't want the user to be copying things into different structures in order to call the function. But the second reason is because I think you can't just call LinearLeastSquares in the same way you do now. Here's why:

The user (in the use cases I am thinking of) will have a std::vector<double> or maybe even a std::deque<double> for the x_values. This will represent a set of times.

The user will also have a std::vector<DataVector> or maybe a std::deque<DataVector> for the y_values, where each DataVector is the characteristic speed at all the points on a surface. The y_values are indexed by yvalues[x_value_index][datavector_index]. This is the opposite indexing order compared to what LinearLeastSquares expects for its y_values.

markscheel · 2022-06-21T23:37:07Z

tests/Unit/NumericalAlgorithms/Interpolation/Test_PredictedZeroCrossing.cpp

+                   slope[i] * x_values[1] + b[i] + error_dist(gen),
+                   slope[i] * x_values[2] + b[i] + error_dist(gen)});
+  }
+


There are 3 sizes in this test:

The size of x_values

The size of y_values

The size of the DataVector inside of y_values

All 3 of those sizes are the same. But only 2 of those sizes should be required to be the same, and the other size should be allowed to be different.
Please write a test where only 2 of the sizes are the same.

For the use case we care about, x_values will be a set of times (probably in a std::vector or maybe in a std::deque). Most likely the size of x_values will be something like 10 or 15.

y_values will be a set of DataVectors, where each DataVector is at one of the times in x_values. So there will be something like 10 or 15 DataVectors. But each DataVector will have probably 100 to 400 components (the components correspond to the points on a Strahlkorper, and for a Strahlkorper of Ylm resolution L, the number of points is roughly L^2).

markscheel · 2022-06-21T23:45:01Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.hpp

+ * \brief Predicts the zero crossing of multiple functions contained in a vector
+ * of datavectors.
+ *
+ * Fits a set of N linear functions, where N is the size of the returned


In the documentation comment, please add a longer description of what is expected in x_values and y_values. In particular, please specify the indexing order of y_values, please specify which sizes in x_values and y_values are supposed to be the same, and please specify the size of the returned value.

markscheel · 2022-06-22T13:26:27Z

One possibility would be something like this (but with more checks added):

DataVector predicted_zero_crossing_value(
    const std::vector<double>& x_values,
    const std::vector<DataVector>& y_values) {
  intrp::LinearLeastSquares<1> predictor{x_values.size()};
  DataVector result(y_values.front().size());
  std::vector<double> tmp_y_values(x_values.size());
  for (size_t i=0;i<result.size();++i) {
    for (size_t j=0;j<tmp_y_values.size();++j) {
      tmp_y_values[j] = y_values[j][i];
    }
    const auto coefficients = predictor.fit_coefficients(x_values,
                                                         tmp_y_values);
    result[i] = -coefficients[0] / coefficients[1];
  }
  return result;
}

And most likely all the std::vectors should be std::deques. I think the ZeroCrossingPredictor needs to store new data as the evolution proceeds (so it would push_back onto the deques) and then when too many times are stored it needs to clear the old data (so it would pop_front off of the deques).

markscheel · 2022-06-22T20:54:02Z

More things that need to be done to make a working ZeroCrossingPredictor (not for this PR, but for future PRs).

The linear least squares function needs to return the error bars for the computed coefficients.
There needs to be a function to combine the values and the error bars, so that it can decide whether a quantity is really about to cross zero and when. See SpEC's ZeroCrossingPredictor.cpp for details on how to do that.
There needs to be a class ZeroCrossingPredictor that will store some set of times and the data associated with those times. That class should probably have two public member functions:
- void add(const double t, const DataVector& data_at_time_t);
- double zero_crossing_time(const double t);
  where the second function returns delta_t, where t+delta_t is the minimum time when any of the data is expected to cross zero. It should return zero if none of the data is expected to cross zero. (or the function could return std::optional<double> instead of double so that zero is no longer a special value).

wthrowe · 2022-06-22T21:00:19Z

It should return zero if none of the data is expected to cross zero. (or the function could return std::optional<double> instead of double so that zero is no longer a special value).

Or maybe infinity?

markscheel · 2022-06-22T21:50:29Z

It should return zero if none of the data is expected to cross zero. (or the function could return std::optional<double> instead of double so that zero is no longer a special value).

Or maybe infinity?

Yes infinity would work too; just need something that you can test for in the size control logic.

MarloMo · 2022-06-24T15:23:46Z

It should return zero if none of the data is expected to cross zero. (or the function could return std::optional<double> instead of double so that zero is no longer a special value).

Or maybe infinity?

Yes infinity would work too; just need something that you can test for in the size control logic.

More things that need to be done to make a working ZeroCrossingPredictor (not for this PR, but for future PRs).

The linear least squares function needs to return the error bars for the computed coefficients.

There needs to be a function to combine the values and the error bars, so that it can decide whether a quantity is really about to cross zero and when. See SpEC's ZeroCrossingPredictor.cpp for details on how to do that.

There needs to be a class ZeroCrossingPredictor that will store some set of times and the data associated with those times. That class should probably have two public member functions:

void add(const double t, const DataVector& data_at_time_t);

double zero_crossing_time(const double t);
where the second function returns delta_t, where t+delta_t is the minimum time when any of the data is expected to cross zero. It should return zero if none of the data is expected to cross zero. (or the function could return std::optional<double> instead of double so that zero is no longer a special value).

Okay, I think I got the basic version of the function to work now (the corrections for this pr). I can start working on the next implementations with Gabe on the LinearLeastSqaures function and newer updates on a new pr. I hope the new doc is more explicit about what the function does. @markscheel

markscheel

The correspondence in sizes is the same as in the original version. x_values and y_values need to be the same size. The DataVectors inside y_values should be a different size.

Please update code and the test.

markscheel · 2022-06-24T19:12:39Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.cpp

+  DataVector tmp_y_values{x_values.size()};
+  for (size_t i = 0; i < result.size(); i++) {
+    for (size_t j = 0; j < tmp_y_values.size(); j++) {
+      tmp_y_values[j] = y_values[i][j];


Shouldn't this be y_values[j][i] ? Each i indicates a point in the DataVector, and each j indicates a point in x_values.

Yes, that's correct. I was not clear on how we needed to index through the y_values in my test which confused me when I was setting the tmp_y_values vector.

markscheel · 2022-06-24T19:14:34Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.cpp

+  ASSERT(x_values.size() == y_values[0].size(),
+         "The x_values and y_values must be of the same size");


This is the wrong ASSERT, and is the same problem as your original version.

x_values.size() should be equal to y_values.size(). y_values[0].size() should have nothing to do with x_values.size().

Yeah, that makes sense now, I should have an ASSERT that checks for the y_values and x_values to be the same size.

markscheel · 2022-06-24T19:16:21Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.cpp

+  DataVector tmp_x_values{x_values.size()};
+  for (size_t i = 0; i < x_values.size(); i++) {
+    tmp_x_values[i] = x_values[i];
+  }


Why do you need tmp_x_values at all? Does it work if you make tmp_y_values a deque , and then you pass two deques into fit_coefficients? (This is what I thought you were going to do; this is why I thought you added a deque to the instantiations in LinearLeastSquares.cpp)

I don't, problem fixed, thank you

markscheel · 2022-06-24T19:20:47Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.cpp

+         "The x_values and y_values must be of the same size");
+  intrp::LinearLeastSquares<1> predictor{x_values.size()};
+
+  DataVector result(y_values.size());


Size of result should be y_values.front().size().

markscheel · 2022-06-24T19:36:56Z

src/NumericalAlgorithms/Interpolation/PredictedZeroCrossing.hpp

+ * The x_values contain the set of values possibly approaching zero while
+ * the y_values contain the DataVectors that contain for example the set of
+ * points on a Strahlkorper. Each element of y_values is a
+ * DataVector of size N. The first index to y_values selects a Strahlkorper
+ * and the second index selects a specfic point on that surface.
+ * The y_values are indexed by [DataVector][point_on_strahlkorper].
+ * The x_values and y_values must be of the same size and type.


Suggested change

* The x_values contain the set of values possibly approaching zero while

* the y_values contain the DataVectors that contain for example the set of

* points on a Strahlkorper. Each element of y_values is a

* DataVector of size N. The first index to y_values selects a Strahlkorper

* and the second index selects a specfic point on that surface.

* The y_values are indexed by [DataVector][point_on_strahlkorper].

* The x_values and y_values must be of the same size and type.

* Each linear function is fit by M points, where M is the size of x_values and y_values.

* The Nth point in the returned DataVector is the value of x for which the linear fit to

* the points (x[:], y[:][N]) [using python notation] crosses y=0.

* The x_values contain M points in the independent variable, while

* the y_values contain M DataVectors that each contain (for example) the set of

* points on a Strahlkorper. Each element of y_values is a

* DataVector of size N. The first index to y_values selects a point corresponding

* to an x_value, and the second index selects a point in the DataVector,

* so the y_values are indexed by [x_value_index][point_in_datavector].

* The x_values and y_values must be of size M.

MarloMo · 2022-06-29T17:39:56Z

Okay, I made all the necessary changes @markscheel

markscheel

Looks good. One small naming change; please squash immediately.

markscheel · 2022-06-30T19:11:00Z

tests/Unit/NumericalAlgorithms/Interpolation/Test_PredictedZeroCrossing.cpp

+  }
+  CAPTURE(slope);
+
+  std::deque<double> b{};


Is there a better name than b?

y_intercept?

y_intercept works

knelli2 · 2022-07-01T19:04:55Z

Everything looks good to me. @MarloMo could you rebase this on the latest develop one more time? There were some recent CI changes that just got merged so not all the required checks have been run.

MarloMo · 2022-07-02T18:13:54Z

Looks like there was a timeout under the unit test Unit.Visualization.Python.Render1D. Is there a way where I can restart the tests to make sure it passes or should this be okay for now?

knelli2 · 2022-07-02T18:55:31Z

Actually I think it was the GH constraints test, but either way it's unrelated to this PR so I wouldn't worry about it.

Side note: @sxs-collaboration/spectre-core-devs, didn't we just increase the timeout of the GH constraints test?

MarloMo changed the title ~~Add predicted_zero_crossing function to take Datavectors~~ Add predicted_zero_crossing function generalized for Datavectors Jun 1, 2022

knelli2 requested changes Jun 3, 2022

View reviewed changes

MarloMo force-pushed the zero_crossing_predictor_datavector branch from 9739743 to b38222e Compare June 8, 2022 21:12

MarloMo requested a review from knelli2 June 9, 2022 20:33

geoffrey4444 requested changes Jun 9, 2022

View reviewed changes

MarloMo force-pushed the zero_crossing_predictor_datavector branch from b38222e to 5532ecf Compare June 11, 2022 20:57

MarloMo requested a review from geoffrey4444 June 11, 2022 20:58

markscheel requested changes Jun 17, 2022

View reviewed changes

markscheel requested changes Jun 21, 2022

View reviewed changes

MarloMo force-pushed the zero_crossing_predictor_datavector branch from 5532ecf to bf7ef4e Compare June 23, 2022 22:42

markscheel requested changes Jun 24, 2022

View reviewed changes

MarloMo force-pushed the zero_crossing_predictor_datavector branch from bae1456 to 19eac89 Compare June 28, 2022 00:12

markscheel reviewed Jun 30, 2022

View reviewed changes

MarloMo force-pushed the zero_crossing_predictor_datavector branch 2 times, most recently from b0e7b96 to 99e2be0 Compare July 1, 2022 03:03

MarloMo force-pushed the zero_crossing_predictor_datavector branch from 99e2be0 to 74a41f0 Compare July 1, 2022 19:55

markscheel previously approved these changes Jul 1, 2022

View reviewed changes

knelli2 previously approved these changes Jul 1, 2022

View reviewed changes

MarloMo added 2 commits July 5, 2022 16:06

Add predicted_zero_crossing function to take Datavectors

ea550a1

Add std::deque<double> instantiation

b1017d3

MarloMo dismissed stale reviews from knelli2 and markscheel via b1017d3 July 5, 2022 23:41

MarloMo force-pushed the zero_crossing_predictor_datavector branch from 74a41f0 to b1017d3 Compare July 5, 2022 23:41

MarloMo requested review from knelli2 and markscheel July 6, 2022 18:11

markscheel approved these changes Jul 7, 2022

View reviewed changes

knelli2 approved these changes Jul 7, 2022

View reviewed changes

geoffrey4444 approved these changes Aug 1, 2022

View reviewed changes

kidder merged commit 658428e into sxs-collaboration:develop Aug 1, 2022

markscheel mentioned this pull request Aug 2, 2022

Change PredictedZeroCrossing algorithm. #4137

Merged

3 tasks

		* \brief Predicts the zero crossing of multiple functions contained in a vector
		* of datavectors.

		for (size_t i = 0; i < y_values.size(); i++) {
		const auto coefficients = predictor.fit_coefficients(x_values, y_values);

		ASSERT(x_values.size() == y_values[0].size(),
		"The x_values and y_values must be of the same size");

- * The x_values contain the set of values possibly approaching zero while
- * the y_values contain the DataVectors that contain for example the set of
- * points on a Strahlkorper. Each element of y_values is a
- * DataVector of size N. The first index to y_values selects a Strahlkorper
- * and the second index selects a specfic point on that surface.
- * The y_values are indexed by [DataVector][point_on_strahlkorper].
- * The x_values and y_values must be of the same size and type.
+ * Each linear function is fit by M points, where M is the size of x_values and y_values.
+ * The Nth point in the returned DataVector is the value of x for which the linear fit to
+ * the points (x[:], y[:][N]) [using python notation] crosses y=0.
+ * The x_values contain M points in the independent variable, while
+ * the y_values contain M DataVectors that each contain (for example) the set of
+ * points on a Strahlkorper. Each element of y_values is a
+ * DataVector of size N. The first index to y_values selects a point corresponding
+ * to an x_value, and the second index selects a point in the DataVector,
+ * so the y_values are indexed by [x_value_index][point_in_datavector].
+ * The x_values and y_values must be of size M.

Add predicted_zero_crossing function generalized for Datavectors #4004

Add predicted_zero_crossing function generalized for Datavectors #4004

Uh oh!

Conversation

MarloMo commented May 30, 2022

Proposed changes

Upgrade instructions

Code review checklist

Further comments

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MarloMo Jun 4, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

knelli2 commented Jun 16, 2022

Uh oh!

markscheel Jun 17, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

markscheel Jun 21, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

markscheel commented Jun 22, 2022

Uh oh!

markscheel commented Jun 22, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wthrowe commented Jun 22, 2022

Uh oh!

markscheel commented Jun 22, 2022

Uh oh!

MarloMo commented Jun 24, 2022

Uh oh!

markscheel left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MarloMo Jun 4, 2022 •

edited

Loading

markscheel Jun 17, 2022 •

edited

Loading

markscheel Jun 21, 2022 •

edited

Loading

markscheel commented Jun 22, 2022 •

edited

Loading

MarloMo Jun 30, 2022 •

edited

Loading