Is it possible to "parametrize" a benchmark? #48

Open
lelit opened this issue Mar 21, 2016 · 9 comments

@lelit

lelit commented Mar 21, 2016

I want to benchmark different JSON engines' serialization/deserialization functions, with different sets of data. More specifically, I'm trying to convert an already existing set of benchmarks to pytest-benchmark.

Here, contenders is a list of (name, serialization_func, deserialization_func) tuples:

@pytest.mark.benchmark(group='serialize default_data')
@pytest.mark.parametrize('serializer',
                         [c[1] for c in contenders],
                         ids=[c[0] for c in contenders])
def test_serializer_benchmark(serializer, benchmark):
    benchmark(serializer, default_data)

@pytest.mark.benchmark(group='deserialize default_data')
@pytest.mark.parametrize('serializer,deserializer',
                         [(c[1], c[2]) for c in contenders],
                         ids=[c[0] for c in contenders])
def test_deserialization_benchmark(serializer, deserializer, benchmark):
    data = serializer(default_data)
    benchmark(deserializer, data)

This produces two distinct benchmark tables, one for the serialization functions and one for their counterparts. I could go down the boring road of repeating that pattern for each dataset...
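
For reference, a minimal contenders setup might look like this (a hypothetical sketch using only the stdlib json module; the real list would include every engine being compared):

import json

# each entry is (name, serialization_func, deserialization_func)
contenders = [
    ('stdlib json', json.dumps, json.loads),
]

default_data = {'key': 'value', 'numbers': list(range(100))}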

What I'd like to achieve is to factor that out into something like the following (which does not work):

@pytest.mark.parametrize('name,data', [('default data', default_data)])
def test_gen(name, data):
    @pytest.mark.benchmark(group=name + ': serialize')
    @pytest.mark.parametrize('serializer',
                             [c[1] for c in contenders],
                             ids=[c[0] for c in contenders])
    def serializer_benchmark(serializer, benchmark):
        benchmark(serializer, data)

    @pytest.mark.benchmark(group=name + ': deserialize')
    @pytest.mark.parametrize('serializer,deserializer',
                             [(c[1], c[2]) for c in contenders],
                             ids=[c[0] for c in contenders])
    def deserializer_benchmark(serializer, deserializer, benchmark):
        serialized_data = serializer(data)
        benchmark(deserializer, serialized_data)

    yield serializer_benchmark
    yield deserializer_benchmark

That way I could reuse the very same code to create benchmarks against all the other datasets, without repetition, simply by adding them to the initial parametrize:

@pytest.mark.parametrize('name,data', [('default data', default_data),
                                       ('array 256 doubles', doubles),
                                       ('array 256 unicode', unicode_strings),
                                      ])
def test_gen(name, data):
...

Is there any trick I'm missing?

@ionelmc
Owner

ionelmc commented Mar 21, 2016

@lelit Am I understanding correctly that you want to parametrize (or build up from parametrization) the group name?

@ionelmc
Owner

ionelmc commented Mar 21, 2016

Then you have two options:

Maybe we could have @RonnyPfannschmidt's idea of @pytest.mark.benchmark(group='{name}: deserialize') ...

@lelit
Author

lelit commented Mar 21, 2016

Yes, I think Ronny is on the right path, but I'm missing what would inject name and deserialize into the benchmark group description... maybe an outer parametrize decorator?

@RonnyPfannschmidt

When taking the benchmark group from a test item, the parametrization is already accessible on the test item, so that's where things can be taken out.

@ionelmc
Owner

ionelmc commented Mar 21, 2016

An example hook that you'd put in your conftest.py:

from collections import defaultdict

def pytest_benchmark_group_stats(config, benchmarks, group_by):
    result = defaultdict(list)
    for bench in benchmarks:
        # group each benchmark by its 'name' parameter plus the test name
        result["%s: %s" % (bench.params['name'], bench.name)].append(bench)
    return result.items()

Not sure, but you might need the bench.params property from the master branch...

@ionelmc
Owner

ionelmc commented Mar 21, 2016

Yeah, with 3.0 you'd need to parse that specific parameter out of bench.param, e.g.:

result["%s: %s" % (bench.param.split('-')[0], bench.name)].append(bench)
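
Put together, a 3.0-compatible conftest.py might look like this (a sketch, assuming the dataset name is the first dash-separated token of the parametrize id):

from collections import defaultdict

def pytest_benchmark_group_stats(config, benchmarks, group_by):
    result = defaultdict(list)
    for bench in benchmarks:
        # in 3.0, bench.param is the full parametrize id string;
        # the dataset name is assumed to be its first dash-separated token
        name = bench.param.split('-')[0]
        result["%s: %s" % (name, bench.name)].append(bench)
    return result.items()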

@lelit
Author

lelit commented Mar 25, 2016

Thank you!

That function fulfilled my needs: I was able to reduce the original benchmarks down to a handful of functions.

@ionelmc
Owner

ionelmc commented Apr 9, 2016

It didn't initially occur to me, but you can also do this:

@pytest.mark.parametrize('foo', [1, 2, 3])
def test_perf(benchmark, foo):
    benchmark.group = '%s - perf' % foo
    benchmark(...)
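
Applied to the original question, that removes the need for a custom hook entirely (a sketch, reusing contenders and default_data from the first comment):

import pytest

@pytest.mark.parametrize('name,data', [('default data', default_data)])
@pytest.mark.parametrize('serializer',
                         [c[1] for c in contenders],
                         ids=[c[0] for c in contenders])
def test_serializer(benchmark, name, data, serializer):
    # build the group name from the dataset parameter at run time
    benchmark.group = '%s: serialize' % name
    benchmark(serializer, data)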

@ionelmc
Owner

ionelmc commented Apr 9, 2016

Also, you can set the group from a fixture to reduce boilerplate, e.g.:

@pytest.fixture(params=[1, 2, 3])
def foo(benchmark, request):
    benchmark.group = '%s - perf' % request.param
    return request.param

def test_perf(benchmark, foo):
    benchmark(...)

The only constraint is that the fixture needs to be function-scoped (as the benchmark fixture is).
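
Tying this back to the datasets in the question, the fixture could carry both the group name and the data (a sketch; the params echo the first comment, and json.dumps stands in for a real contender):

import json
import pytest

@pytest.fixture(params=[('default data', default_data),
                        ('array 256 doubles', doubles)],
                ids=lambda p: p[0])
def dataset(benchmark, request):
    name, data = request.param
    # set the group once here instead of in every test
    benchmark.group = '%s: serialize' % name
    return data

def test_serializer(benchmark, dataset):
    benchmark(json.dumps, dataset)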
