Tooling for automated stats gathering #115

Closed · 11 of 13 tasks

brandtbucher opened this issue Oct 13, 2022 · 3 comments
brandtbucher commented Oct 13, 2022

We have a wonderful mechanism (--with-pystats) for quantifying the success of many of our optimizations. However, there is currently quite a bit of friction involved in collecting that data. I think that the situation can be improved without too much difficulty.

My current wishlist for pyperformance runs on builds configured --with-pystats:

  • A cron job to run pyperformance with stats turned on, perhaps weekly, and check the results into the ideas repo. This can run on GitHub Actions, since we don't actually care about performance stability. It can also be parallelized, and maybe even make use of pyperformance's --fast option.
    • Bonus points: compare the stats with the previous run, and surface any "interesting" regressions (hit rates have plummeted without our knowledge before); see the sketch after this list.
  • A way to run a stats build of pyperformance using a label on any CPython PR, and report the results in a comment. Currently, collecting stats for a PR is a slow process that must be completed locally.
    • Bonus points: also run stats for the base commit.
      • Bonus bonus points: compare the stats automatically, surfacing any "interesting" changes.
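
A rough sketch of what that comparison step could look like, assuming the weekly job leaves behind a flat JSON mapping of counter names to counts (the real pystats JSON layout may differ, and the 5% cutoff is arbitrary):

```python
import json

THRESHOLD = 0.05  # flag relative drops larger than 5% (arbitrary cutoff)

def load_stats(path):
    with open(path) as f:
        return json.load(f)

def interesting_regressions(base_path, head_path, threshold=THRESHOLD):
    """Yield (name, base, head) for counters that dropped noticeably."""
    base = load_stats(base_path)
    head = load_stats(head_path)
    for name, base_count in base.items():
        head_count = head.get(name)
        if not isinstance(base_count, (int, float)) or head_count is None:
            continue
        if base_count > 0 and (base_count - head_count) / base_count > threshold:
            yield name, base_count, head_count

if __name__ == "__main__":
    for name, old, new in interesting_regressions("last_week.json", "this_week.json"):
        print(f"possible regression: {name}: {old} -> {new}")
```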

I also know that @markshannon has expressed a desire to have pyperformance (or maybe pyperf?) turn stats gathering on and off using 3.12's new sys utilities before and after running each benchmark, so that we're gathering stats on just the benchmarks themselves and ignoring the external benchmarking machinery (sketched below).
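
For example, wrapping each benchmark with the private `sys._stats_on()` / `sys._stats_off()` / `sys._stats_clear()` / `sys._stats_dump()` hooks that landed for 3.12 might look roughly like this (a sketch, not pyperf's actual implementation; these hooks exist only on stats-enabled builds and are private, so they may change):

```python
import sys

def run_with_stats(benchmark, *args):
    """Collect pystats for just the benchmark call, not the harness."""
    if not hasattr(sys, "_stats_clear"):
        raise RuntimeError("this interpreter was not built with pystats enabled")
    sys._stats_clear()    # drop anything counted while the harness was setting up
    sys._stats_on()       # start counting
    try:
        return benchmark(*args)
    finally:
        sys._stats_off()   # stop counting before teardown runs
        sys._stats_dump()  # write the collected stats out and clear them
```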

CC @mdboom

Individual tasks to get there:

[task checklist not captured; 11 of 13 completed per the issue header]

mdboom commented Oct 13, 2022

I think this is a great idea. Unlike timing benchmarks, stats are a lot more deterministic (though I suspect not 100% deterministic). I like the idea of stats serving as a bit of a "tie breaker" when the raw timings are too close to call (only for PRs that are obviously trying to improve the specializer, of course).

> I also know that @markshannon has expressed a desire to have pyperformance (or maybe pyperf?) turn stats gathering on and off using 3.12's new sys utilities before and after running each benchmark, so that we're gathering stats on just the benchmarks themselves and ignoring the external benchmarking machinery.

Yeah -- I was just working on this part this morning: psf/pyperf#140

mdboom self-assigned this Oct 14, 2022
mdboom added a commit to mdboom/cpython that referenced this issue Oct 28, 2022
This adds support for comparing pystats collected from two different builds.

- The `--json-output` flag can be used to load a set of raw stats and output a JSON file.
- Two of these JSON files can be provided on the next run, and then comparative
  results between the two are output.

The refactor required is basically to (see the sketch after this list):

- Separate out the building of table contents from emitting the table
- Call these new functions from functions designed for either single results or
  comparative results

Part of the work for: faster-cpython/tools#115
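
A simplified illustration of that split (hypothetical names, not the actual summarize_stats.py code): table contents are built as plain data, and a single emitter is shared by the single-result and comparative paths.

```python
def calc_counts_table(stats):
    # Build rows as plain data; no printing happens here.
    return {name: count for name, count in sorted(stats.items())}

def emit_table(header, rows):
    # All output formatting lives in one place, shared by both modes.
    print(" | ".join(header))
    for row in rows:
        print(" | ".join(str(cell) for cell in row))

def emit_single(stats):
    table = calc_counts_table(stats)
    emit_table(("Name", "Count"), table.items())

def emit_comparative(base_stats, head_stats):
    base = calc_counts_table(base_stats)
    head = calc_counts_table(head_stats)
    names = sorted(set(base) | set(head))
    rows = [(n, base.get(n, 0), head.get(n, 0)) for n in names]
    emit_table(("Name", "Base", "Head"), rows)
```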
mdboom added a commit to mdboom/pyperformance that referenced this issue Nov 4, 2022
This, in particular, is to get the new pystats support in pyperf
as part of:

  faster-cpython/tools#115
corona10 pushed a commit to python/pyperformance that referenced this issue Nov 6, 2022
This, in particular, is to get the new pystats support in pyperf
as part of:

  faster-cpython/tools#115
mdboom added a commit to faster-cpython/ideas that referenced this issue Nov 29, 2022
mdboom commented Dec 1, 2022

Adding this automation to PRs would require committing more automation to upstream CPython, where there is (understandable) wariness about adding more complexity. As a halfway measure, it turned out to be really easy to add support for running the automation in the ideas repo against an arbitrary fork and branch:

[screenshot of the manual workflow trigger with fork and branch inputs]

I think that might be good enough for now, and it at least handles the fiddly bits of collecting the stats correctly. The main downsides are that it's a little less discoverable, and that only someone with write permissions to the ideas repo can kick it off. It is possible to give individuals that permission, though.

mdboom commented Feb 9, 2023

Closing as DONE. Diffing isn't automated, but it is possible with two sets of pystats .json files and the latest Tools/scripts/summarize_stats.py script.

mdboom closed this as completed Feb 9, 2023