Skip to content

Benchmark parameters I#56

Merged
rmartinho merged 22 commits into
libnonius:develfrom
arximboldi:parameters
Sep 1, 2016
Merged

Benchmark parameters I#56
rmartinho merged 22 commits into
libnonius:develfrom
arximboldi:parameters

Conversation

@arximboldi

@arximboldi arximboldi commented Aug 3, 2016

Copy link
Copy Markdown
Contributor

Hi!

I managed to get started with the feature. There is work to be done, but I open the pull request in case you want to do an early review so I can change my approach if needed. Note that this branch depends on the changes in #54

Here's the progress:

  • Benchmark writers can declare one or more parameters of any streamable type.
  • Benchmark parameters can be set to a value from the command line.
  • The benchmark runner can be told from the command line try a whole range (linear or exponential) of range values for one (and only one) parameter.
  • Improve the handling of parameter types.
  • Testing (I know, one should do that first, but the areas of the code I needed to touch did not have any tests yet so I just hacked my way through.)
  • Update the HTML reporter to show a graph with the different means and variances for the test run with different parameters.

Nice to have that I might not do:

  • Update documentation mentioning this feature.
  • Do something special for other reporters when trying a parameter range.
  • Improve parameter parsing on the command line so one can escape the separator : in string parameters.
  • Add runs through ranges of values over multiple parameters, using support for 3D and/or contour plots from Plotly.
  • Explicitly defining benchmark suites (capturing parameters at suite level for brevity)

RFC @rmartinho

@rmartinho

Copy link
Copy Markdown
Collaborator

The build server cannot build this PR because the author is not whitelisted. A meatbag will have to look at it.

This is an automated message.

@rmartinho

Copy link
Copy Markdown
Collaborator

Thanks for putting this up so I can review it. I hope I can get some time on sunday. If you want you can rebase on devel now that I merged the other PR.

The user may declare parameters by doing, at the top level:
```cpp
    NONIUS_PARAM(name, <default_value>);
```

This declares a parameter.  Parameters can be then passed to the `main`
program by passing arguments of the form:
```
    -p <name>:<value>
```

The parameters have to be accessed from the chronometer.  Since
parameters can have any type and are string-addressed, they can only be
finally parsed or unboxed and looked up in the benchmark.  To stop users
from mistakenly adding that cost to their tests, they are forced to
extract the parameters before, from the chronometer.  Of course they can
cheat by capturing the chronometer, but the documentation should guide
users in the right direction.  A good example:
```
NONIUS_BENCHMARK("push_back vector", [](nonius::chronometer meter)
{
    auto n = meter.param("size");
    auto storage = std::vector<std::vector<char>>(meter.runs());
    meter.measure([&](int i) {
        auto& x = storage[i];
        for (auto j = 0; j < n; ++j)
            x.push_back(static_cast<char>(j));
    });
})
```
The parameter name is now required to be a string already.  There is no
fundamental reason why parameter names should be limited to identifier
names rules.  The fact that `NONIUS_DETAIL_UNIQUE_NAME` already exists
is convenient here.

This changes syntax for the user from:
```cpp
    NONIUS_PARAM(name, 42);
```

To:
```cpp
    NONIUS_PARAM("name", 42)
```

Note the semicolon is not needed anymore either.
The syntax is:
```
    -p <name>:<oper>:<init>:<delta>:<steps>
```

This tells that the benchmark should be run with `<steps>` different
values for parameter `<name>`.  Each value is computed as:
```
    next = last <oper> deta
```

Where `<oper>` may be `+` or `*` as for testing with linear or
exponential increments (the later is useful when checking logarithmic
algorithms or subtle non-linearities in the "constant factors" of an
implementation).
The old code was convoluted, with too many assertions and implicit
assumptions.  Instead of generating all the samples that we want to try
in the argument parsing, we delay this logic to a more appropriate place
later, without loss of information.

The only drawback of the new configuration representation is that it may
be less flexible, as the previous one allowed to easily add passing
explicitly all the various parmeters to try.  We can, though, easily add
this later by adding more options to the configuration.
The functions `go_benchmark` and `go_param` where extracted, since the
function had grown too big to my taste after the parameter running logic
had been introduced.

Also, the main loop has been simplified to avoid the need to `continue`,
which is more structured IMHO.
We always store parameters generically as `std::string`.  The user
implicitly suggests a concrete value type with the type of the default
value of the parameter.  We use that type when we need to do arithmetic
on the parameter and also when checking that it parses correctly when
passed from the command line.  We do this by casting the strings back to
the original `T` using the `param_spec` interface.

This might seem overkill.  Another approach could be to type erase the
concrete values once intially parsed from the command line, using a
`boost::any`-like type erasure class.  However that approach is less
flexible, because it requires the user to provide the specific concrete
type when calling `chronometer::param<T>()` and just a mistake in
signedness or whatever could be fatal.  Just parsing everything into and
back from strings is more forgiving on numeric types, which is the most
common case.

This commit adds a new `example7` that where using ranged parameter
would not work correctly previously.
@arximboldi arximboldi force-pushed the parameters branch 5 times, most recently from 222ae88 to 6ce1fa3 Compare August 6, 2016 12:07
In the last commit, we broke parameters of types that do not support `+`
and `*` operators.  Some of these parameters might be useful, for
example `std::string`.

Now, parameters may or may not support these operations.  When they do
not support, the benchmarks will fail when we try to run the parameter
through a range with those operations.

Funny, we can do:
```
   ./bin/examples/example6 -p string:+:hola-mundo:-yeah:10
```

But we can't:
```
   ./bin/examples/example6 -p string:*:hola-mundo:-oh,no!:10
```
This commit refactors a lot of the code added in this branch. The
original goal was to reverse how `go()` runs through various
parameter values.  Before it did like:

    for each benchmark
        for each parameter
            run-benchmark()

Now it is more like

    for each parameter
        for each benchmark
            run-benchmark()

This new structure should make writing reporters easier, since it better
fits how the data could be presented to the user in meaningful ways --
we'll see more of this in later commits.

A couple of more refactorings has sneaked in this commit:

- `param_map` and `param_registry` etc. are not in `detail::` anymore.
  This is to better fit other parts of the library (e.g. benchmark,
  reporter) that are not in `detail` and arguably at the same level of
  abstraction from the user API POV.

- A couple of functions now have a more "data driven" approach, trying
  to reuse previously computed data more often, as to improve the
  overall performance.
The old API for benchmarks that have fixtures is like this:
```cpp
NONIUS_BENCHMARK([] (nonius::chronometer meter){
  // setup code
  m.measure([&] {
      // benchmark code
  })
});
```

That API is nice, but since it has to assume that the benchmark code can
always mutate the setup, the setup can not be reused across samples.
This is ok in general, but for benchmarks that have expensive setups
relative to their measured time, time might be wasted uneedesly.

Our alternative syntax is like:
```cpp
NONIUS_BENCHMARK([] (nonius::parameters meter){
  // immutable setup code
  return [=] {
      // benchmark code
  }
});
```

Or even further:
```cpp
NONIUS_BENCHMARK([] (nonius::parameters params){
  // immutable setup code
  return [=] (nonius::chronometer meter) {
      // mutable setup code
      meter.measure([&] {
          // benchmark code
      })
  }
});
```

The idea is that when the benchmark function can take a `parameters`
argument, it is assumed to be a factory that then returns the actual
benchmark function.  This factory function is called *only once* for a
given set of parameters (in `prepare(...)`) and the resulting benchmark
function is kept in the `execution_plan` and reused for collecting all
the samples.

Furthermore, this syntax may be more intuitive for people that need
parameters, but do not need a chronometer.
When trying to figure out how long to run a test to get a meaningful
value, we may fall into an infinite loop when the whole benchmark has
been optimized away.  We now try to detect this case by hard-limiting
the number of iterations we do in the loop.

This partly solves libnonius#52.  A complete solution involves properly
deoptimizing the return value of the benchmark function.
The exception message would get mixed together with other lines of
messages creating confusion.
Now all the logic of deciding which parameters to execute is contained
in `generate_params`.  We removed the "feature" that `parameters` looks
up default values.  This way the code we remove dependencies on global
state, and make the feature more testable.  `generate_params` now merges
the default values with the fixed values with the values generated for a
parameter run.
The command line parser would only look at the first instance of an
argument.  This would prevent us from passing multiple parameters to the
benchmark runner. Examples that did not work but now do work:
```
    # Would only set string=foo, discarding size
    example6 -p string:foo -p size:42
```
```
    # Would only run size:42,52,..., discarding string
    example6 -p size:+:42:10:10 -p string:foo
```
@arximboldi arximboldi changed the title [WIP] Benchmark parameters Benchmark parameters Aug 6, 2016
@arximboldi

Copy link
Copy Markdown
Contributor Author

Thanks! No problem with the timing, take your time. In the meantime I will be using the branch in the project that motivated the change, which might help polishing it.

Comment thread examples/example7.c++
NONIUS_PARAM("x", 1.0)

template <typename Fn>
struct volatilize_fn {

@rmartinho rmartinho Aug 9, 2016

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is here because of #52, right? If it works, I'd say it should be part of what nonius does when running the code and not in the examples. Basically, I think that any such technique that can be applied generically should be built-in.

I have been researching a bit on optimizer control, and I put up some code for that (#57). If you could have a look at it, that'd be great.

@rmartinho rmartinho added this to the Parameterized benchmarking milestone Aug 9, 2016
@rmartinho rmartinho self-assigned this Aug 9, 2016
@rmartinho rmartinho modified the milestones: Boostless, Parameterized benchmarking, RELEASE: 1.2 Aug 9, 2016
@rmartinho

rmartinho commented Aug 9, 2016

Copy link
Copy Markdown
Collaborator

It might be important for nonius to know which benchmarks parameters are used by each benchmark. As is, all benchmarks are repeated for all parameter values, even if they don't depend on them. This is fine if you group sets of benchmarks each in its own runner but not so nice if you put different ones together. I'm not sure how important such a use case would be, but I see having to split benchmarks across multiple runners as an inconvenience given that the runners have filtering capabilities.

I'm not sure how to have that work, but maybe some kind of benchmark grouping is needed. Just bringing it up for discussion.

@arximboldi

arximboldi commented Aug 9, 2016

Copy link
Copy Markdown
Contributor Author

@rmartinho with regard to grouping. I also thought of that. The best solution I thought of so far is to add a notion of "suite". Parameters can then be scoped to a suite. Furthermore, this syntax might make it easier to use parameter objects as proxy to query the parameters without runtime cost nor global state. Here is an example of how benchmarks could look like:

NONIUS_SUITE("push_back", [] {
    NONIUS_PARAM(N, std::size_t{10});
    NONIUS_BENCHMARK("vector", [=] {
        for (auto i = 0u; i < N; ++i) 
              ...
    })
    NONIUS_BENCHMARK("string", [=] {
        for (auto i = 0u; i < N; ++i) 
             ...
    })
})

I was hesitant to implement that in this branch because it requires even deeper changes to the runner and registration system, possibly breaking compatibility. If you like it I can try implementing it to see how it feels...

@ThePhD

ThePhD commented Aug 9, 2016

Copy link
Copy Markdown
Contributor

If I could throw my two cents in here...

Y E S .

Grouping is a BIG thing for me. I was going to implement it myself when I had time, because I was curious to see how well many of the library I benched performed when you scaled up a certain parameter, and when you compared it across libraries. Replacing the x-axis value of "sample #" for the default graph with instead a kind of "grouping A", "grouping B", and (for the HTML graph bits) doing a scatter-bar there in to represent all the samples of that grouping would be immensely helpful.

(I ended up using matplotlib with python to create a fancier graph to do what I want: this is what it ended up looking like: http://sol2.readthedocs.io/en/latest/benchmarks.html)

@rmartinho

rmartinho commented Aug 9, 2016

Copy link
Copy Markdown
Collaborator

Let's try that on a separate branch (though you can branch off of this one if you want; this will definitely be merged). See issue #59.

@arximboldi

Copy link
Copy Markdown
Contributor Author

Cool. I just pushed a little fix on this branch for an old bug. I will move further experimenting with syntax to a new branch.

@arximboldi arximboldi changed the title Benchmark parameters Benchmark parameters I Aug 10, 2016
@rmartinho rmartinho mentioned this pull request Aug 11, 2016
Comment thread include/nonius/param.h++
@@ -0,0 +1,176 @@
// Nonius - C++ benchmarking tool
//
// Written in 2014 by Martinho Fernandes <martinho.fernandes@gmail.com>

@rmartinho rmartinho Aug 12, 2016

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha, I see you copied these from some existing stale ones. Since this is PD this notice is somewhat irrelevant, but if you want to claim credit for this file, you're welcome to do so. Otherwise I'll probably do a sweeping change in the repo to something in the spirit of "by the nonius contributors nonius@rmf.io"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hehe, yup!

I think long term it's easiest to put something like "by the nonius contributors nonius@rmf.io, see AUTHORS file." Then the AUTHORS file may list maintainers, contributors, etc.

@rmartinho rmartinho merged commit ebb807a into libnonius:devel Sep 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants