
Conversation


@larsoner larsoner commented Apr 27, 2018

As mentioned in #8769, this tries adding asv to the CircleCI build, running on:

1. All commits to master
2. PRs that touch benchmarks/ files
3. PR commits that contain [circle benchmark]

Also fixes a broken URL.

Eventually we probably want to do artifact deployment for master builds, but hopefully someone who knows more about asv can do that part.

First let's see if it actually works in a reasonable amount of time (e.g., under 2 hours).

# Gather machine info for asv (${arch} and ${cpu} are presumably set earlier in the config)
ram=$(awk -F"[: ]+" '/MemTotal/ {print $2;exit}' /proc/meminfo)
os=$(uname -sr)
# Write the machine description that asv machine would otherwise prompt for
printf "{\n\"CircleCI\": {\n\"arch\": \"${arch}\",\n\"cpu\": \"${cpu}\",\n\"machine\": \"CircleCI\",\n\"os\": \"${os}\",\n\"ram\": \"${ram}\"\n},\n\"version\": 1\n}" | tee ~/.asv-machine.json
# Run the benchmark suite via the test runner
python -u runtests.py --bench
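For reference, the JSON that this printf is escaping could be produced more readably with Python's json module -- a minimal sketch with made-up values (the actual config derives arch, cpu, os, and ram from the shell commands above; this snippet is not part of the PR):

import json
import os

# Made-up values; the CI config derives these from uname, /proc/cpuinfo,
# and /proc/meminfo as shown above.
machine_info = {
    "CircleCI": {
        "arch": "x86_64",
        "cpu": "Intel(R) Xeon(R) CPU",
        "machine": "CircleCI",
        "os": "Linux 4.15.0",
        "ram": "8167172",
    },
    "version": 1,
}

# asv reads machine descriptions from ~/.asv-machine.json
with open(os.path.expanduser("~/.asv-machine.json"), "w") as f:
    json.dump(machine_info, f, indent=2)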
Contributor

I'll be interested to hear what @pv and others think about this -- I think past discussions about adding asv directly to CI were not favorable, but perhaps the infrastructure has advanced sufficiently now.

Member Author

I have been doing ~2-hour doc builds on master commits on CircleCI on a repo with a similar master-commit-rate for a year or two now. Assuming the time here is in this ballpark and that the number of benchmark-related PRs stays in some reasonable range, this should be fine.

In the future if we get too many PRs, we can start limiting it to master commits only.

@larsoner
Member Author

It got 91% done, then timed out on "No output for 10 minutes":

https://circleci.com/gh/scipy/scipy/6735?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link

@tylerjereddy do you know if there is some particularly slow benchmark that could be split into multiple benchmarks to avoid this?

@tylerjereddy
Contributor

@larsoner What about adding the -q (for "quick") flag to the benchmark arguments in the CI context? It looks like runtests.py builds up bench_args from the items in extra_argv and runs asv continuous.

From the linked docs there, -q is specified to run each benchmark function only once, which is not acceptable for proper benchmarking but is well suited to executing the code and finding errors in new or modified benchmarks.
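For readers unfamiliar with the wrapper, the forwarding described above is roughly shaped like this -- a simplified sketch, not the actual runtests.py code (the function and variable names are illustrative):

import subprocess

def run_benchmarks(extra_argv):
    # Everything given after --bench ends up in extra_argv and is
    # forwarded to asv, so a flag like -q would be appended verbatim.
    bench_args = []
    for item in extra_argv:
        bench_args.extend(["--bench", item])
    cmd = ["asv", "continuous"] + bench_args
    return subprocess.run(cmd).returncode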

@tylerjereddy
Contributor

I think you'll need the brackets around the [-q] as well, believe it or not, but we can see what the CI says.

@larsoner
Member Author

Okay that only took 40 minutes. I think the doc build is around 30.

The output does not seem too useful:

https://6741-1460385-gh.circle-artifacts.com/0/html-benchmarks/index.html

@tylerjereddy
Contributor

Looks like there may be one or two issues (a few exceptions are raised) in the text output that I see for the benchmarks in the CircleCI console. Does it make sense that only about 12 benchmarks are performed? I normally use asv directly instead of our wrapper infrastructure for driving it, but that certainly seems like a small number of benchmarks.

. venv/bin/activate
export SHELL=$(which bash)
# Ensure an upstream remote exists so we can diff against its master
if ! git remote -v | grep upstream ; then git remote add upstream git://github.com/scipy/scipy.git; fi
# Collect the files this branch changed relative to upstream/master
changed=$(git diff --name-only $CIRCLE_BRANCH $(git merge-base $CIRCLE_BRANCH upstream/master));
Member

I seem to recall CircleCI had some syntax to do exactly this (run only if changes are under some path); it may be useful to check its docs.

Member Author

Can't find an official implementation, just ideas.

@tylerjereddy
Contributor

NumPy has been doing a quick asv run in their Travis CI for quite some time now.

In particular, they use asv dev, which automatically implies --quick. That's probably just intended as a sanity check that the benchmark code isn't mechanically broken.

@rlucas7 rlucas7 added the Benchmarks Running, verifying or documenting benchmarks for SciPy label Dec 29, 2019
@larsoner larsoner force-pushed the asv branch 2 times, most recently from a3a9eca to 53f9eca on September 2, 2020 18:52
@larsoner
Member Author

larsoner commented Sep 2, 2020

> I think you'll need the brackets around the [-q] as well, believe it or not, but we can see what the CI says.

> In particular, they use asv dev, which automatically implies --quick. That's probably just intended as a sanity check that the benchmark code isn't mechanically broken.

I went with the dev command and --bench [-q] to at least get it to run in some sensible time. Locally with the CircleCI commands I can get it to run, and asv preview opens a website, but there are no results. I wonder if you need two runs or something :(

@larsoner larsoner changed the title WIP: Run benchmarks CI: Run benchmarks Sep 3, 2020
@larsoner larsoner added this to the 1.6.0 milestone Sep 3, 2020
@larsoner
Member Author

larsoner commented Sep 3, 2020

Okay, it runs and renders! The output is not terribly useful, though; I can't find any benchmarks:

https://21265-1460385-gh.circle-artifacts.com/0/html-benchmarks/index.html

It would be good if some asv expert could fix this at some point. But this at least:

  1. Clears the CircleCI hurdles to running benchmarks

  2. Runs the -q tests, making sure nothing breaks on each commit

  3. Adds some stuff that will allow [ci skip], [skip ci], and [skip github] to skip running the GitHub checks -- I think it needs to be in master to take effect, though. Locally on my fork it seems to work.

This is now ready for review/merge from my end.

@rgommers
Member

rgommers commented Sep 4, 2020

There's a bunch of errors during the asv run (e.g., `NameError: name 'magic_square' is not defined`), but CI is green. It would be good to make CI fail in this PR, then have a separate PR to fix the bugs that we can merge straight away, and then rebase this PR onto master.

@rgommers
Member

rgommers commented Sep 4, 2020

Running benchmarks in only a couple of minutes in CI is great by the way!

# Make the built SciPy importable
export PYTHONPATH=$PWD/build/testenv/lib/python3.7/site-packages
cd benchmarks
# Register this machine's description under the name CircleCI
asv machine --machine CircleCI
# Quick benchmark pass against the current checkout
asv --config asv.conf.json dev -m CircleCI --python=same --bench [-q]
Contributor

I don't understand this [-q] regular expression. It has been a while since I worked with regular expressions, but doesn't this restrict the run to only the test cases containing a 'q' or a '-' somewhere in the name or parameters?

With --bench [-q], 41 benchmarks are found, but without this argument asv finds 234 benchmarks.
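To make the character-class reading concrete, here is a quick check (the benchmark names are made up):

import re

# asv treats the --bench argument as a regular expression, so "[-q]" is a
# character class matching any benchmark name containing '-' or 'q'.
pattern = re.compile("[-q]")
print(bool(pattern.search("optimize.BenchLeastSquares")))  # True: 'q' in "Squares"
print(bool(pattern.search("signal.Convolve2D")))           # False: no '-' or 'q'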

Contributor

I see above that this -q was probably intended to cause "quick" operation, but it is instead being interpreted as a regular expression for the --bench argument.

According to the asv docs, asv dev is already equivalent to asv run --quick --show-stderr --python=same, so -q doesn't need to be specified to get quick operation.

Unfortunately, removing it is going to make running the benchmarks take much longer, as many benchmarks that didn't match the regular expression were being skipped.

Member Author

> I see above that this -q was probably intended to cause "quick" operation, but it is instead being interpreted as a regular expression for the --bench argument.

Indeed, I assumed that this was intentional from the comments above, but I think I misinterpreted them. I agree it's not great to skip that many. I guess we'll see how long it takes to run all of them.

Maybe we'll need to somehow mark some benchmarks as slow so that they aren't run on CircleCI.

Member Author

#12732 might give us some ideas for how to do this.

@larsoner
Member Author

larsoner commented Sep 9, 2020

> There's a bunch of errors during the asv run (e.g., `NameError: name 'magic_square' is not defined`), but CI is green. It would be good to make CI fail in this PR, then have a separate PR to fix the bugs that we can merge straight away, and then rebase this PR onto master.

Almost all benchmark files have boilerplate code along the lines of:

try:
    # Tolerate missing internals so the benchmark file can still be collected
    from scipy.optimize.tests.test_linprog import lpgen_2d, magic_square
    ...
except ImportError:
    pass

I think this is intentional. I guess we could have an env var SCIPY_ALLOW_MISSING_BENCHMARKS that defaults to 1 but, when set to 0 (which I could do on CircleCI), raises an ImportError, halting the run.
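One possible shape for such a guard, using the env var name the PR eventually settled on (a sketch only; the helper name safe_import is illustrative, not necessarily the exact implementation):

import os
from contextlib import contextmanager

@contextmanager
def safe_import():
    # Swallow ImportError by default so benchmark files can always be
    # collected, but re-raise it when the env var is set to 0 (as on
    # CircleCI below) so that broken imports fail the run.
    try:
        yield
    except ImportError:
        if os.environ.get("SCIPY_ALLOW_BENCH_IMPORT_ERRORS", "1") == "0":
            raise

# A benchmark file would then replace the try/except boilerplate with:
with safe_import():
    from scipy.optimize.tests.test_linprog import lpgen_2d, magic_square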

Member Author

@larsoner larsoner left a comment

  • Fixed the bugs with importing. The run will now raise an error when SCIPY_ALLOW_BENCH_IMPORT_ERRORS=0, which it is on CircleCI (and some failures are visible, for example here), so it seems to be working.
  • Takes ~20 minutes to run (in parallel with the doc build, which is ~16 min) after the ~10 minute build, so hopefully in the same ~30 minute total range as the other elements of the test suite.
  • The artifact is still worthless, but that can be a follow-up PR for someone who knows how to make it useful. At least the CI mechanics should be sorted out.

Ready for review/merge from my end!

"Note that it can take several hours to run; intermediate output\n"
"can be found under benchmarks/global-bench-results.json\n"
"You can specify functions to benchmark via SCIPY_GLOBAL_BENCH=AMGM,Adjiman,...")
raise NotImplementedError()
Member Author

@pv I had to get rid of this -- it made every one of the (hundreds?) of skipped benchmarks emit this message, which made the CircleCI log and local terminal output overly verbose. I refactored it into code comments above instead.

asv machine --machine CircleCI
export SCIPY_GLOBAL_BENCH_NUMTRIALS=1
# Fail the run when a benchmark file cannot import what it needs
export SCIPY_ALLOW_BENCH_IMPORT_ERRORS=0
# Run everything except the BenchGlobal benchmarks (see below)
time asv --config asv.conf.json dev -m CircleCI --python=same --bench '^((?!BenchGlobal).)*$'
Member Author

I also explicitly disallow running the BenchGlobal benchmarks because, even with all tests skipped, just having the BenchGlobal.track_all function run locally takes ~1 min, even though it's just giving nan for all results. I tried various ways of avoiding this locally, but it must be something in the asv mechanics that makes it take so long.
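For reference, the negative lookahead in that pattern rejects any benchmark whose full name contains the substring BenchGlobal (the names below are illustrative):

import re

# '^((?!BenchGlobal).)*$' only matches if the lookahead succeeds at every
# position, i.e. "BenchGlobal" appears nowhere in the name.
pattern = re.compile(r"^((?!BenchGlobal).)*$")
print(bool(pattern.match("optimize.BenchLeastSquares")))          # True: kept
print(bool(pattern.match("go_benchmark.BenchGlobal.track_all")))  # False: excluded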

Member Author

Experiencing the same problem with the new QuadraticAssignment -- it takes 30 seconds just to set up the tests even if none are run. So I'll explicitly skip these, too.

@larsoner
Member Author

CIs fixed, all green (sorry they weren't before)

@larsoner
Member Author

larsoner commented Nov 2, 2020

Rebased to get things green. Doing so showed that #12775 two months ago added some very slow benchmarks. I made a comment in the code that it should use is_xslow at some point, though it has the same "benchmark collection" slowdown I mentioned above.

It would be nice to merge this sooner rather than later, though, so that these problems are exposed directly in PRs.
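For context, the is_xslow gate mentioned above is a simple environment-variable check, roughly like the following sketch (based on the benchmark suite's common helpers; the exact details may differ):

import os

def is_xslow():
    # Extremely slow benchmarks opt in via SCIPY_XSLOW=1; CI leaves it
    # unset, so their setup() can raise NotImplementedError, which asv
    # treats as "skip this benchmark".
    try:
        return int(os.environ.get("SCIPY_XSLOW", "0")) > 0
    except ValueError:
        return False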

Member

@rgommers rgommers left a comment

LGTM and all green, let's give it a go! Thanks @larsoner!
