Build Python polars wheels with PGO #9702

Open
messense opened this issue Jul 4, 2023 · 6 comments
Labels
enhancement New feature or an improvement of an existing feature

Comments

@messense
Contributor

messense commented Jul 4, 2023

Problem description

PGO (profile-guided optimization) seems to have been a success for pydantic-core. I wonder whether it would also speed up py-polars.

@messense added the enhancement label on Jul 4, 2023
@ritchie46
Member

Yes, I'd be very interested in that. The db-benchmark and TPC-H benchmark code seem like interesting candidates for the workload used to generate the profile data.

Tests would be the easiest. I don't have any experience setting this up though, so any help is appreciated.

@messense
Contributor Author

messense commented Jul 5, 2023

@ritchie46 Is there an easy way to compare two benchmark results? I'd like to confirm that PGO actually improves performance before putting in too much effort.

@ritchie46
Member

The easiest place to start is the TPC-H benchmarks: https://github.com/pola-rs/tpch/tree/main/polars_queries

The repo has a Makefile that creates the dataset. Running the queries produces a timing table, and the repo also has some plotting utilities. Now that I think of it, that would also be good PGO input.
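
For comparing two runs (for example, a regular build against a PGO build), a minimal sketch might look like the following. It assumes the timings have already been collected into plain dictionaries keyed by query name. The function name `compare_timings` and the numbers are hypothetical and only illustrate the calculation; they are not taken from the tpch repo or from any real measurement.

```python
from statistics import median


def compare_timings(baseline, candidate):
    """Return the per-query speedup of `candidate` over `baseline`.

    Both arguments map a query name to a list of wall-clock timings in
    seconds. A speedup above 1.0 means the candidate build is faster.
    Only queries present in both runs are compared.
    """
    report = {}
    for query in sorted(baseline.keys() & candidate.keys()):
        # The median is less sensitive to a single noisy run than the mean.
        base = median(baseline[query])
        cand = median(candidate[query])
        report[query] = base / cand
    return report


# Hypothetical timings in seconds, made up purely for illustration.
baseline = {"q1": [1.20, 1.18, 1.22], "q2": [0.80, 0.82, 0.81]}
pgo = {"q1": [1.00, 1.02, 0.98], "q2": [0.81, 0.80, 0.82]}

for query, speedup in compare_timings(baseline, pgo).items():
    print(f"{query}: {speedup:.2f}x")
```

Running each query several times and comparing medians helps separate a real improvement from run-to-run noise.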

@zamazan4ik

You might also be interested in my recent benchmarks of PGO applied to different kinds of software, including many databases and database libraries such as RocksDB: https://github.com/zamazan4ik/awesome-pgo

@jmakov

jmakov commented Aug 19, 2023

Judging from some of the blog posts, a speedup of around 15% looks achievable. That's roughly the difference between one CPU generation and the next.

@FilipAndersson245

Any update on this? PGO is a very interesting way to improve performance, with the only cost being added deployment and build complexity.
I can recommend looking at https://github.com/Kobzol/cargo-pgo for inspiration on how to run it.
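
As a rough illustration of the build complexity involved, the sketch below lists the usual phases of a PGO build for a Rust extension: instrumented build, profile collection, profile merge, optimized rebuild. It is an assumption about how this could be wired up, not the polars build process. The `-Cprofile-generate` and `-Cprofile-use` rustc flags and `llvm-profdata merge` are real; driving the builds through `maturin`, the helper name `pgo_build_steps`, the paths, and the `run_benchmarks.py` workload script are all hypothetical. The function only describes the steps and does not run anything.

```python
def pgo_build_steps(
    profile_dir="/tmp/pgo-data",
    merged="/tmp/pgo-data/merged.profdata",
    workload=("python", "run_benchmarks.py"),  # hypothetical workload script
):
    """Return (extra_env, argv) pairs describing a PGO build, without running them."""
    return [
        # 1. Instrumented build: the compiled library writes .profraw files
        #    into profile_dir whenever it is exercised.
        ({"RUSTFLAGS": f"-Cprofile-generate={profile_dir}"},
         ["maturin", "develop", "--release"]),
        # 2. Run a representative workload against the instrumented build.
        ({}, list(workload)),
        # 3. Merge the raw profiles into a single .profdata file.
        ({}, ["llvm-profdata", "merge", "-o", merged, profile_dir]),
        # 4. Rebuild the wheel, letting the compiler use the merged profile.
        ({"RUSTFLAGS": f"-Cprofile-use={merged}"},
         ["maturin", "build", "--release"]),
    ]


for env, argv in pgo_build_steps():
    prefix = " ".join(f"{key}={value}" for key, value in env.items())
    print(f"{prefix} {' '.join(argv)}".strip())
```

The `llvm-profdata` binary generally needs to match the LLVM version used by the Rust toolchain. cargo-pgo automates that matching and the phases above for plain cargo projects.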

5 participants