
Make it possible to perform more than one benchmark per pytest test #166

Open
ntninja opened this issue Apr 10, 2020 · 6 comments

Comments

@ntninja

ntninja commented Apr 10, 2020

Say, I have a test function like this:

import pytest
import trio

# bench_trio_write_100_files_1K_serial and bench_fsds_write_100_files_1K_serial
# are async benchmark bodies defined elsewhere in my test suite.

@pytest.mark.benchmark(group="write_100_files_1K_serial")
def test_bench_write_100_files_1K_serial(temp_path, benchmark1, benchmark2):
    benchmark1.name = "trio"
    benchmark1(trio.run, bench_trio_write_100_files_1K_serial, temp_path)

    benchmark2.name = "datastore"
    benchmark2(trio.run, bench_fsds_write_100_files_1K_serial, temp_path)

    assert benchmark2.stats.stats.median < (2 * benchmark1.stats.stats.median)

Since both of these benchmark calls are I/O bound (or should be, anyway – different story), I cannot compare them against fixed values. Instead, I'd like to compare the relative slow-down/speed-up of my code against some reference code – that is what the assert at the end of the test does.

And while the above code works flawlessly, it only does so because of some private API usage:

import pytest
import pytest_benchmark.plugin

# Each fixture calls the function wrapped by pytest-benchmark's "benchmark"
# fixture directly, so the test receives two independent benchmark objects.
@pytest.fixture(scope="function")
def benchmark1(request):
    return pytest_benchmark.plugin.benchmark.__pytest_wrapped__.obj(request)

@pytest.fixture(scope="function")
def benchmark2(request):
    return pytest_benchmark.plugin.benchmark.__pytest_wrapped__.obj(request)

See also pytest-dev/pytest#2703 for the pytest-side limitation. The “official solution” recommended by pytest is to turn such fixtures into factory functions. Would you be comfortable exposing something like that as part of this library?
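
For reference, a factory-function version of the same workaround might look roughly like this (a rough, untested sketch that still relies on the same private API; benchmark_maker is just a placeholder name):

import pytest
import pytest_benchmark.plugin
import trio

@pytest.fixture(scope="function")
def benchmark_maker(request):
    # Factory fixture: every call produces a fresh, independent benchmark
    # object by invoking the function wrapped by the "benchmark" fixture.
    def make():
        return pytest_benchmark.plugin.benchmark.__pytest_wrapped__.obj(request)
    return make

def test_bench_write_100_files_1K_serial(temp_path, benchmark_maker):
    benchmark1 = benchmark_maker()
    benchmark1.name = "trio"
    benchmark1(trio.run, bench_trio_write_100_files_1K_serial, temp_path)

    benchmark2 = benchmark_maker()
    benchmark2.name = "datastore"
    benchmark2(trio.run, bench_fsds_write_100_files_1K_serial, temp_path)

    assert benchmark2.stats.stats.median < (2 * benchmark1.stats.stats.median)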

@ionelmc
Owner

ionelmc commented May 10, 2020

Well, I guess we could have a make_benchmark or benchmark_setup (pytest-django style) fixture ...

I still don't get your use case. You only need this to compare and assert the relative results of two benchmarks?

@patrick91

@ionelmc I might have a use case for this: I'm rewriting an API and I'd like to compare its performance with the previous API to make sure the new one is not slower. I'm doing this with fixtures at the moment, but maybe calling the benchmark function twice and checking the times would be better :)

@ionelmc
Owner

ionelmc commented Nov 2, 2020

@patrick91 perhaps you could use one of the hooks (e.g. pytest_benchmark_update_json) to make some assertions on the results?

Or perhaps pytest_benchmark_group_stats if you compare to past data?

I doubt the plugin could have a nicer way to deal with your use case, as there are so many ways of looking at and doing things with the data. That's why the plugin has options to output JSON in the first place.
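
As a rough sketch (in conftest.py; the benchmark names and the 2x threshold are just placeholders), an assertion in that hook might look like:

# conftest.py -- untested sketch of asserting on relative results in a hook
def pytest_benchmark_update_json(config, benchmarks, output_json):
    # output_json["benchmarks"] is a list of dicts with "name", "group"
    # and a "stats" mapping (median, mean, ...) for each benchmark.
    medians = {b["name"]: b["stats"]["median"] for b in output_json["benchmarks"]}
    if "test_new_api" in medians and "test_old_api" in medians:
        assert medians["test_new_api"] < 2 * medians["test_old_api"]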

@sarahbx

sarahbx commented Apr 2, 2021

Hi @ionelmc, I have a use case for this: a long-running test with multiple stages that I would like to benchmark individually. With the current behavior, getting the necessary data points means running the test multiple times, benchmarking only one stage at a time. This can significantly increase overall testing time, since each run needs its own setup and teardown. My initial thought is that pedantic mode could be expanded with whatever additional arguments are required to facilitate this. Thoughts?

EDIT: what if target could take a list? E.g.:

def test_the_thing(fixture):
  def setup(): ...
  def stage1(args): ...
  def stage2(args): ...
  trigger_external_async_process()  # Call not included in benchmark
  benchmark.pedantic(target=[stage1, stage2], setup=setup, rounds=1, ...)
...

@lpsinger

I really love pytest-benchmark, but I am also in a situation where my use case requires multiple benchmarks per test case in order to avoid unreasonable setup/teardown time.

I am benchmarking some software that involves setting up and tearing down a database, and my tests are parametrized by the number of sample rows so that I can measure and plot the scaling of the code and compare it against the expected big-O behavior. The database gets populated with random data, but repeatedly setting it up and tearing it down is expensive. What I would like to do is put the benchmark inside a for-loop that adds more random data to the database on each iteration.
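
Roughly, the pattern I would like to write is something like the following (not currently possible, since the benchmark fixture can only be used once per test; the helpers are placeholders):

def test_query_scaling(database, benchmark):
    for n_rows in (10, 100, 1000, 10000):
        add_random_rows(database, n_rows)   # grow the same database in place
        benchmark(run_query, database)      # would need a fresh benchmark each time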

@cafhach

cafhach commented Jul 5, 2022

my use case requires multiple benchmarks per test case in order to avoid unreasonable setup/teardown time.

Could you alternatively solve this by reusing a fixture (e.g. module scope)?
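
Concretely, something along these lines might cover it (a sketch only; the database helpers are placeholders):

import pytest

@pytest.fixture(scope="module")
def database():
    # Expensive setup/teardown happens once per module instead of once per test.
    db = create_test_database()   # placeholder helper
    yield db
    db.drop()                     # placeholder teardown

@pytest.mark.parametrize("n_rows", [10, 100, 1000, 10000])
def test_query_scaling(database, benchmark, n_rows):
    add_random_rows(database, n_rows)   # placeholder helper
    benchmark(run_query, database)      # one benchmark per parametrized test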
