
Conversation


@larsoner larsoner commented Apr 27, 2018

As mentioned in #8769, this tries adding asv to the CircleCI build, running on:

1. All commits to master
2. PRs that touch benchmarks/ files
3. PR commits that contain [circle benchmark]

Also fixes a broken URL.

Eventually we probably want to do artifact deployment for master builds, but hopefully someone who knows more about asv can do that part.

First let's see if it actually works in a reasonable amount of time (e.g., under 2 hours).

# Gather machine info for asv (${arch} and ${cpu} are presumably set earlier in the config)
ram=$(awk -F"[: ]+" '/MemTotal/ {print $2;exit}' /proc/meminfo)
os=$(uname -sr)
# Write the machine description that asv machine would otherwise prompt for
printf "{\n\"CircleCI\": {\n\"arch\": \"${arch}\",\n\"cpu\": \"${cpu}\",\n\"machine\": \"CircleCI\",\n\"os\": \"${os}\",\n\"ram\": \"${ram}\"\n},\n\"version\": 1\n}" | tee ~/.asv-machine.json
# Run the benchmark suite via the test runner
python -u runtests.py --bench
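For reference, the JSON that this printf is escaping could be produced more readably with Python's json module -- a minimal sketch with made-up values (the actual config derives arch, cpu, os, and ram from the shell commands above; this snippet is not part of the PR):

import json
import os

# Made-up values; the CI config derives these from uname, /proc/cpuinfo,
# and /proc/meminfo as shown above.
machine_info = {
    "CircleCI": {
        "arch": "x86_64",
        "cpu": "Intel(R) Xeon(R) CPU",
        "machine": "CircleCI",
        "os": "Linux 4.15.0",
        "ram": "8167172",
    },
    "version": 1,
}

# asv reads machine descriptions from ~/.asv-machine.json
with open(os.path.expanduser("~/.asv-machine.json"), "w") as f:
    json.dump(machine_info, f, indent=2)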
Contributor

I'll be interested to hear what @pv and others think about this -- I think past discussions about adding asv directly to CI were not favorable, but perhaps the infrastructure has advanced sufficiently now.

Member Author

I have been doing ~2-hour doc builds on master commits on CircleCI on a repo with a similar master-commit-rate for a year or two now. Assuming the time here is in this ballpark and that the number of benchmark-related PRs stays in some reasonable range, this should be fine.

In the future if we get too many PRs, we can start limiting it to master commits only.

@larsoner
Member Author

It got 91% done, then timed out on "No output for 10 minutes":

https://circleci.com/gh/scipy/scipy/6735?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link

@tylerjereddy do you know if there is some particularly slow benchmark that could be split into multiple benchmarks to avoid this?

@tylerjereddy
Contributor

@larsoner What about adding the -q (for "quick") flag to the benchmark arguments in the CI context? It looks like runtests.py builds up bench_args from the items in extra_argv and runs asv continuous.

From the linked docs there, -q is specified to run each benchmark function only once, which is not acceptable for proper benchmarking but is well suited to executing the code and finding errors in new or modified benchmarks.
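For readers unfamiliar with the wrapper, the forwarding described above is roughly shaped like this -- a simplified sketch, not the actual runtests.py code (the function and variable names are illustrative):

import subprocess

def run_benchmarks(extra_argv):
    # Everything given after --bench ends up in extra_argv and is
    # forwarded to asv, so a flag like -q would be appended verbatim.
    bench_args = []
    for item in extra_argv:
        bench_args.extend(["--bench", item])
    cmd = ["asv", "continuous"] + bench_args
    return subprocess.run(cmd).returncode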

@tylerjereddy
Contributor

I think you'll need the brackets around the [-q] as well, believe it or not, but we can see what the CI says.

@larsoner
Member Author

Okay that only took 40 minutes. I think the doc build is around 30.

The output does not seem too useful:

https://6741-1460385-gh.circle-artifacts.com/0/html-benchmarks/index.html

@tylerjereddy
Contributor

Looks like there may be one or two issues (a few exceptions are raised) in the text output that I see for the benchmarks in the CircleCI console. Does it make sense that only about 12 benchmarks are performed? I normally use asv directly instead of our wrapper infrastructure for driving it, but that certainly seems like a small number of benchmarks.

. venv/bin/activate
export SHELL=$(which bash)
# Ensure an upstream remote exists so we can diff against its master
if ! git remote -v | grep upstream ; then git remote add upstream git://github.com/scipy/scipy.git; fi
# Collect the files this branch changed relative to upstream/master
changed=$(git diff --name-only $CIRCLE_BRANCH $(git merge-base $CIRCLE_BRANCH upstream/master));
Member

I seem to recall CircleCI had some syntax to do exactly this (run only if changes are under some path); it may be useful to check its docs.

Member Author

Can't find an official implementation, just ideas.

@tylerjereddy
Contributor

NumPy has been doing a quick asv run in their Travis CI for quite some time now.

In particular, they use asv dev, which automatically implies --quick. That's probably just intended as a sanity check that the benchmark code isn't mechanically broken.

@rlucas7 rlucas7 added the Benchmarks Running, verifying or documenting benchmarks for SciPy label Dec 29, 2019
@larsoner larsoner force-pushed the asv branch 2 times, most recently from a3a9eca to 53f9eca on September 2, 2020 18:52
@larsoner
Member Author

larsoner commented Sep 2, 2020

> I think you'll need the brackets around the [-q] as well, believe it or not, but we can see what the CI says.

> In particular, they use asv dev, which automatically implies --quick. That's probably just intended as a sanity check that the benchmark code isn't mechanically broken.

I went with the dev command and --bench [-q] to at least get it to run in some sensible time. Locally with the CircleCI commands I can get it to run, and asv preview opens a website, but there are no results. I wonder if you need two runs or something :(

@larsoner larsoner changed the title WIP: Run benchmarks CI: Run benchmarks Sep 3, 2020
@larsoner larsoner added this to the 1.6.0 milestone Sep 3, 2020
@larsoner
Member Author

larsoner commented Sep 3, 2020

Okay, it runs and renders! The output is not terribly useful, though; I can't find any benchmarks:

https://21265-1460385-gh.circle-artifacts.com/0/html-benchmarks/index.html

It would be good if some asv expert could fix this at some point. But this at least:

  1. Clears the CircleCI hurdles to running benchmarks

  2. Runs the -q tests, making sure nothing breaks on each commit

  3. Adds some stuff that will allow [ci skip], [skip ci], and [skip github] to skip running the GitHub checks -- I think it needs to be in master to take effect, though. Locally on my fork it seems to work.

This is now ready for review/merge from my end.

@rgommers
Member

rgommers commented Sep 4, 2020

There's a bunch of errors during the asv run (e.g., `NameError: name 'magic_square' is not defined`), but CI is green. It would be good to make CI fail in this PR, then have a separate PR to fix the bugs that we can merge straight away, and then rebase this PR onto master.

@rgommers
Member

rgommers commented Sep 4, 2020

Running benchmarks in only a couple of minutes in CI is great by the way!

# Make the built SciPy importable
export PYTHONPATH=$PWD/build/testenv/lib/python3.7/site-packages
cd benchmarks
# Register this machine's description under the name CircleCI
asv machine --machine CircleCI
# Quick benchmark pass against the current checkout
asv --config asv.conf.json dev -m CircleCI --python=same --bench [-q]
Contributor

I don't understand this [-q] regular expression. It has been a while since I worked with regular expressions, but doesn't this restrict the run to only the test cases containing a 'q' or a '-' somewhere in the name or parameters?

With --bench [-q], 41 benchmarks are found, but without this argument asv finds 234 benchmarks.
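To make the character-class reading concrete, here is a quick check (the benchmark names are made up):

import re

# asv treats the --bench argument as a regular expression, so "[-q]" is a
# character class matching any benchmark name containing '-' or 'q'.
pattern = re.compile("[-q]")
print(bool(pattern.search("optimize.BenchLeastSquares")))  # True: 'q' in "Squares"
print(bool(pattern.search("signal.Convolve2D")))           # False: no '-' or 'q'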

Contributor

I see above that this -q was probably intended to cause "quick" operation, but it is instead being interpreted as a regular expression for the --bench argument.

According to the asv docs, asv dev is already equivalent to asv run --quick --show-stderr --python=same, so -q doesn't need to be specified to get quick operation.

Unfortunately, removing it is going to make running the benchmarks take much longer, as many benchmarks that didn't match the regular expression were being skipped.

Member Author

> I see above that this -q was probably intended to cause "quick" operation, but it is instead being interpreted as a regular expression for the --bench argument.

Indeed, I assumed that this was intentional from the comments above, but I think I misinterpreted them. I agree it's not great to skip that many. I guess we'll see how long it takes to run all of them.

Maybe we'll need to somehow mark some benchmarks as slow so that they aren't run on CircleCI.

Member Author

#12732 might give us some ideas for how to do this.

@larsoner
Member Author

larsoner commented Sep 9, 2020

> There's a bunch of errors during the asv run (e.g., `NameError: name 'magic_square' is not defined`), but CI is green. It would be good to make CI fail in this PR, then have a separate PR to fix the bugs that we can merge straight away, and then rebase this PR onto master.

Almost all benchmark files have boilerplate code along the lines of:

try:
    # Tolerate missing internals so the benchmark file can still be collected
    from scipy.optimize.tests.test_linprog import lpgen_2d, magic_square
    ...
except ImportError:
    pass

I think this is intentional. I guess we could have an env var SCIPY_ALLOW_MISSING_BENCHMARKS that defaults to 1 but, when set to 0 (which I could do on CircleCI), raises an ImportError, halting the run.
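One possible shape for such a guard, using the env var name the PR eventually settled on (a sketch only; the helper name safe_import is illustrative, not necessarily the exact implementation):

import os
from contextlib import contextmanager

@contextmanager
def safe_import():
    # Swallow ImportError by default so benchmark files can always be
    # collected, but re-raise it when the env var is set to 0 (as on
    # CircleCI below) so that broken imports fail the run.
    try:
        yield
    except ImportError:
        if os.environ.get("SCIPY_ALLOW_BENCH_IMPORT_ERRORS", "1") == "0":
            raise

# A benchmark file would then replace the try/except boilerplate with:
with safe_import():
    from scipy.optimize.tests.test_linprog import lpgen_2d, magic_square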

Member Author

@larsoner larsoner left a comment

  • Fixed the bugs with importing. The run will now raise an error when SCIPY_ALLOW_BENCH_IMPORT_ERRORS=0, which it is on CircleCI (and some failures are visible, for example here), so it seems to be working.
  • Takes ~20 minutes to run (in parallel with the doc build, which is ~16 min) after the ~10 minute build, so hopefully in the same ~30 minute total range as the other elements of the test suite.
  • The artifact is still worthless, but that can be a follow-up PR for someone who knows how to make it useful. At least the CI mechanics should be sorted out.

Ready for review/merge from my end!

"Note that it can take several hours to run; intermediate output\n"
"can be found under benchmarks/global-bench-results.json\n"
"You can specify functions to benchmark via SCIPY_GLOBAL_BENCH=AMGM,Adjiman,...")
raise NotImplementedError()
Member Author

@pv I had to get rid of this -- it made every one of the (hundreds?) of skipped benchmarks emit this message, which made the CircleCI log and local terminal output overly verbose. I refactored it into code comments above instead.

asv machine --machine CircleCI
export SCIPY_GLOBAL_BENCH_NUMTRIALS=1
# Fail the run when a benchmark file cannot import what it needs
export SCIPY_ALLOW_BENCH_IMPORT_ERRORS=0
# Run everything except the BenchGlobal benchmarks (see below)
time asv --config asv.conf.json dev -m CircleCI --python=same --bench '^((?!BenchGlobal).)*$'
Member Author

I also explicitly disallow running the BenchGlobal benchmarks because, even with all tests skipped, just having the BenchGlobal.track_all function run locally takes ~1 min, even though it's just giving nan for all results. I tried various ways of avoiding this locally, but it must be something in the asv mechanics that makes it take so long.
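For reference, the negative lookahead in that pattern rejects any benchmark whose full name contains the substring BenchGlobal (the names below are illustrative):

import re

# '^((?!BenchGlobal).)*$' only matches if the lookahead succeeds at every
# position, i.e. "BenchGlobal" appears nowhere in the name.
pattern = re.compile(r"^((?!BenchGlobal).)*$")
print(bool(pattern.match("optimize.BenchLeastSquares")))          # True: kept
print(bool(pattern.match("go_benchmark.BenchGlobal.track_all")))  # False: excluded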

Member Author

Experiencing the same problem with the new QuadraticAssignment -- it takes 30 seconds just to set up the tests even if none are run. So I'll explicitly skip these, too.

@larsoner
Member Author

CIs fixed, all green (sorry they weren't before)

@larsoner
Member Author

larsoner commented Nov 2, 2020

Rebased to get things green. Doing so showed that #12775 two months ago added some very slow benchmarks. I made a comment in the code that it should use is_xslow at some point, though it has the same "benchmark collection" slowdown I mentioned above.

It would be nice to merge this sooner rather than later, though, so that these problems are exposed directly in PRs.
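For context, the is_xslow gate mentioned above is a simple environment-variable check, roughly like the following sketch (based on the benchmark suite's common helpers; the exact details may differ):

import os

def is_xslow():
    # Extremely slow benchmarks opt in via SCIPY_XSLOW=1; CI leaves it
    # unset, so their setup() can raise NotImplementedError, which asv
    # treats as "skip this benchmark".
    try:
        return int(os.environ.get("SCIPY_XSLOW", "0")) > 0
    except ValueError:
        return False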

Member

@rgommers rgommers left a comment

LGTM and all green, let's give it a go! Thanks @larsoner!
