
Fetching release info from PyPI using the new API #8326

Closed
bm371613 opened this issue Aug 17, 2023 · 12 comments

Labels
kind/feature Feature requests/implementations status/triage This issue needs to be triaged

Comments

@bm371613

  • I have searched the issues of this repo and believe that this is not a duplicate.
  • I have searched the FAQ and general documentation and believe that my question is not already covered.

Feature Request

Currently, Poetry uses the deprecated PyPI API to get release info from PyPI (code).

This API returns metadata per release, not per artifact. When multiple artifacts in a release have different metadata, this causes missing dependencies, as reported in several related issues.

PEP 658 proposed a better API, and PyPI implemented it in pypi/warehouse#13649.

If Poetry switches to the new API, the above-mentioned issues should disappear with the next release of the affected library (e.g. pytorch).
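For illustration, here is a rough sketch (not Poetry's implementation) of how per-artifact metadata can be fetched through the JSON Simple API (PEP 691) together with the PEP 658/714 metadata files; the project name and version below are only placeholders.

```python
# Minimal sketch: fetch per-artifact METADATA from PyPI via the JSON
# Simple API (PEP 691) and the per-file metadata exposed by PEP 658/714.
import requests

def iter_artifact_metadata(project: str, version: str):
    resp = requests.get(
        f"https://pypi.org/simple/{project}/",
        headers={"Accept": "application/vnd.pypi.simple.v1+json"},
        timeout=10,
    )
    resp.raise_for_status()
    for file_info in resp.json()["files"]:
        if version not in file_info["filename"]:
            continue  # crude version filter, for illustration only
        # PEP 658 exposed this key as "dist-info-metadata"; PEP 714
        # renamed it to "core-metadata". Check both to be safe.
        has_metadata = file_info.get("core-metadata") or file_info.get(
            "dist-info-metadata"
        )
        if not has_metadata:
            continue
        # The METADATA file for an artifact lives at "<file url>.metadata".
        meta = requests.get(file_info["url"] + ".metadata", timeout=10)
        meta.raise_for_status()
        yield file_info["filename"], meta.text

# Placeholder project/version, just to show the shape of the call.
for filename, metadata in iter_artifact_metadata("torch", "2.0.1"):
    print(filename, len(metadata))
```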

@bm371613 bm371613 added the kind/feature and status/triage labels Aug 17, 2023
@bm371613 bm371613 changed the title from "Fetching release info from PyPI using the new JSON API" to "Fetching release info from PyPI using the new API" Aug 17, 2023
@Secrus
Member

Secrus commented Aug 17, 2023

This is already being worked on in #5509

@dimbleby
Contributor

This is already being worked on in #5509

but not in a way that will deal with metadata-per-artifact, I assume.

I'd guess it's unlikely that poetry will ever support this in a meaningful way.

@Secrus
Member

Secrus commented Aug 17, 2023

Well, yeah, it's unlikely for us to switch to per-artifact metadata, since that would break reproducible builds (unless we redo our locking mechanism completely).

@bm371613
Author

I have only recently discovered some details of Poetry-PyPI-pytorch interaction, so I may not see the full picture here, but my understanding is as follows:

Poetry wants to resolve dependencies for all systems and architectures, preferably without downloading too many wheels just for metadata and without relying on non-standardized mechanisms.

Perhaps naively, I think that with metadata available for all artifacts (including system/arch and dependencies), dependencies can be resolved.

I'd guess it's unlikely that poetry will ever support this in a meaningful way.

Well, yeah, it's unlikely for us to switch to per-artifact metadata, since that would break reproducible builds (unless we redo our locking mechanism completely).

Do you mean it's

  • an ill-posed problem in some way?
  • out of scope for Poetry?
  • too difficult?
  • too difficult to be implemented soon?

Perhaps it's easy enough for simpler cases that would cover many problematic libraries?

@dimbleby
Contributor

I think it's quite infeasible to retro-fit per-artifact metadata onto poetry.

People with better ideas are invited to make pull requests, but don't hold your breath!

@bm371613
Author

bm371613 commented Aug 17, 2023

I think it's quite infeasible to retro-fit per-artifact metadata onto poetry.

People with better ideas are invited to make pull requests, but don't hold your breath!

Let's consider what the possible solutions are, sticking to torch as the example library.

The reality we are dealing with is:

  • The user executes poetry add torch and hopes import torch works
  • pytorch developers upload many wheels with different metadata for each release
  • PyPI lets us fetch each artifact's metadata via the new standard API
  • PyPI also lets us fetch the metadata of some (~random) artifact, presented as the release's metadata, via a non-standardized API that will probably be gone in the future
  • poetry already has a dependency resolution mechanism that at some point determines metadata for a release (torch version), not per artifact

Currently, the (name, version) -> metadata resolution required by the existing dependency resolution mechanism is implemented by querying the non-standardized API. This means the metadata may come from an incompatible artifact and import torch fails.
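For contrast, a minimal sketch of that legacy per-release lookup: one metadata blob per (name, version), regardless of which artifact it was derived from. The version number is only a placeholder.

```python
# Sketch of the legacy per-release JSON API: a single requires_dist list
# for the whole release, derived from one (arbitrary) artifact.
import requests

resp = requests.get("https://pypi.org/pypi/torch/2.0.1/json", timeout=10)
resp.raise_for_status()
info = resp.json()["info"]
# This reflects the metadata of a single artifact in the release, which
# may not match the platform doing the resolving.
print(info["requires_dist"])
```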

As I understand it, it is infeasible to consider all artifacts and resolve dependencies for all scenarios, i.e. replacing the (name, version) -> metadata resolution with (name, version) -> list[metadata] and generating a lock file that works on all (system, arch) combinations.

How about leaving the dependency resolution mechanism intact and implementing (name, version) -> metadata so that it works at least on the current (system, arch)? Instead of relying on PyPI to pick ~random metadata via the old API, poetry could fetch metadata for every artifact, drop those that don't match the current environment, and then pick one (see the sketch at the end of this comment).

Hopefully this wouldn't be that much work and would fix poetry add torch for most users without breaking anything.

What do you think?
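A rough sketch of that idea, assuming the packaging library is used to match wheel filenames against the tags supported by the current interpreter; the helper name and example filenames are hypothetical, not Poetry internals.

```python
# Pick, for a given (name, version), one wheel whose tags match the
# current interpreter/platform; its metadata would then stand in for the
# release's metadata on this machine.
from packaging.tags import sys_tags
from packaging.utils import parse_wheel_filename

def pick_compatible_wheel(filenames: list[str]) -> str | None:
    supported = list(sys_tags())  # most-preferred tag first
    best_rank, best_file = None, None
    for filename in filenames:
        if not filename.endswith(".whl"):
            continue
        _, _, _, tags = parse_wheel_filename(filename)
        ranks = [supported.index(t) for t in tags if t in supported]
        if ranks and (best_rank is None or min(ranks) < best_rank):
            best_rank, best_file = min(ranks), filename
    return best_file

# On a Linux x86_64 / CPython 3.11 machine this would prefer the
# manylinux wheel over the macOS or Windows builds (illustrative names).
print(pick_compatible_wheel([
    "torch-2.0.1-cp311-cp311-manylinux1_x86_64.whl",
    "torch-2.0.1-cp311-none-macosx_11_0_arm64.whl",
    "torch-2.0.1-cp311-cp311-win_amd64.whl",
]))
```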

@Secrus
Member

Secrus commented Aug 17, 2023

How about leaving the dependency resolution mechanism intact and implementing (name, version) -> metadata so that it works at least on the current (system, arch)? [...] What do you think?

That would break our lock file since it would no longer be cross-platform. Personally, I'd rather see Pytorch implement proper markers on their dependencies instead of having some complex logic in their setup script. If current standards are not enough, PyTorch should work with PyPA to amend standards with new markers that allow them to express their needs properly.
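For illustration, a small sketch of how PEP 508 environment markers in shared metadata could express such platform-specific dependencies; the requirement strings below are hypothetical examples, not PyTorch's actual metadata.

```python
# PEP 508 markers let one requirement list, shared by every artifact,
# express platform-specific dependencies that resolvers evaluate per target.
from packaging.markers import default_environment
from packaging.requirements import Requirement

requirements = [
    # hypothetical marker-qualified dependencies
    'nvidia-cublas-cu11; platform_system == "Linux" and platform_machine == "x86_64"',
    'triton; platform_system == "Linux" and platform_machine == "x86_64"',
    'typing-extensions',
]

env = default_environment()  # marker values for the current interpreter/platform
for spec in requirements:
    req = Requirement(spec)
    applies = req.marker is None or req.marker.evaluate(env)
    print(f"{req.name}: {'install' if applies else 'skip'} on this platform")
```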

@dimbleby
Contributor

If you only care about resolving dependencies for the current platform, then pip install foo is already a fine solution.

@bm371613
Author

Personally, I'd rather see Pytorch implement proper markers on their dependencies instead of having some complex logic in their setup script.

Do you mean they should have the same dependencies in all artifacts, with markers, as proposed in pytorch/pytorch#105731 (comment) ?

If you only care about resolving dependencies for the current platform, then pip install foo is already a fine solution

Sure, pip solves that problem, but poetry has other advantages, like better support for separating dependency declaration from locking.

@Secrus
Member

Secrus commented Aug 17, 2023

Personally, I'd rather see Pytorch implement proper markers on their dependencies instead of having some complex logic in their setup script.

Do you mean they should have the same dependencies in all artifacts, with markers, as proposed in pytorch/pytorch#105731 (comment) ?

Yes, exactly. For the more complex parts, they could also make their own build backend that follows PEP 517 and wraps setuptools or any other tooling they need.
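For reference, a minimal sketch of such a wrapper backend, assuming an in-tree module (hypothetically _custom_backend.py) referenced from pyproject.toml via build-backend = "_custom_backend" and backend-path = ["."].

```python
# Sketch of a custom PEP 517 backend that delegates to setuptools'
# build_meta, keeping any complex build logic in one place.
from setuptools import build_meta as _setuptools_backend

def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    # Custom pre-build logic (e.g. computing marker-qualified requirements)
    # could run here before delegating to setuptools.
    return _setuptools_backend.build_wheel(
        wheel_directory, config_settings, metadata_directory
    )

def build_sdist(sdist_directory, config_settings=None):
    return _setuptools_backend.build_sdist(sdist_directory, config_settings)

def prepare_metadata_for_build_wheel(metadata_directory, config_settings=None):
    return _setuptools_backend.prepare_metadata_for_build_wheel(
        metadata_directory, config_settings
    )
```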

@bm371613
Author

@Secrus @dimbleby Thank you for your input. I will reference your comments in related pytorch discussions. I'm closing this issue.


This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 29, 2024