Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dtoa: thread safety in --disable-gil builds #111962

Closed
Tracked by #108219
colesbury opened this issue Nov 10, 2023 · 6 comments
Closed
Tracked by #108219

dtoa: thread safety in --disable-gil builds #111962

colesbury opened this issue Nov 10, 2023 · 6 comments
Assignees
Labels
3.13 bugs and security fixes topic-free-threading type-feature A feature request or enhancement

Comments

@colesbury
Copy link
Contributor

colesbury commented Nov 10, 2023

Feature or enhancement

The Python/dtoa.c library is responsible for formatting floating point numbers (double) as strings and for parsing strings into numbers. The shared state is not thread-safe without the GIL:

  1. Balloc and Bfree use a per-interpreter free-list to avoid some allocations of Bigint objects
  2. pow5mult uses a per-interpreter append-only list of Bigint powers of 5

For (1), we can just skip using the freelists in --disable-gil builds. We already have a code path (Py_USING_MEMORY_DEBUGGER) that doesn't use freelists.

#ifndef Py_USING_MEMORY_DEBUGGER

For (2), we can use atomic operations to append to the powers-of-5 linked list in a thread-safe manner. I don't think this needs to be guarded by a Py_NOGIL checks, since each power-of-5 is only ever created once.

For context, here is the modification to Python/dtoa.c in nogil-3.12. Note that it uses a PyMutex for (2), which I think we can avoid.

dragonbox, Ryū, Grisu, Schubfach

In the past 5 or so years, there have been a number of faster float-to-string algorithms with the desirable attributes (correctly rounded, no "garabage" digits, etc.). To my knowledge they are also all thread-safe. "dragonbox" looks the most promising, but porting it to C is a bigger undertaking than making the existing dtoa.c thread-safe.

Linked PRs

@ericvsmith
Copy link
Member

@mdickinson for awareness.

@mdickinson
Copy link
Member

Thanks, @ericvsmith.

@colesbury

In the past 5 or so years, there have been a number of faster float-to-string algorithms [...]

I'm not opposed to moving away from dtoa.c (quite the opposite), but it's important to recognise just how much dtoa.c is doing for us right now - not just float-to-string in all its wonderful variants (shortest string, correctly-rounded fixed precision including the case of negative precision, correctly-rounded fixed number of significant digits) but correctly-rounded string-to-float, too. I like the look of dragonbox, but it only covers one subcase (shortest string) of one direction (float-to-string). We'd need to find substitutes for all the various pieces of dtoa.c.

@mdickinson
Copy link
Member

The suggested mitigations for (1) and (2) sound good to me, FWIW.

@mdickinson
Copy link
Member

mdickinson commented Nov 12, 2023

I guess we should probably open a separate issue if we want to explore the idea of moving away from dtoa.c entirely. Dragonbox looks good for shortest-string float-to-str. Daniel Lemire's fast_double_parser might do in the other direction for correctly-rounded str-to-float, but I haven't looked at it in detail and I don't know how battle-tested it is. The main speed priority would be for numeric strings with 17 or fewer significant decimal digits (since that's the max number of digits that the shortest-string algorithm will produce); it seems fine to me if a slower algorithm is used for larger numbers of significant digits. That leaves support for e-style formatting, f-style formatting, and two-argument round (including the case where the second argument to round is negative). Again it would seem fine to have slower fallback options for e and f-style formatting with stupid numbers of digits, since that's a niche use-case, but it would be a shame to lose Python's correct rounding here. E.g., format(0.1, '.55f') should continue to give '0.1000000000000000055511151231257827021181583404541015625' rather than producing something misleading like 0.1000000000000000055500000000000000000000000000000000000 (which is what some other implementations/languages do).

@mdickinson
Copy link
Member

Daniel Lemire's fast_double_parser might do in the other direction

Scratch that; the README says "We encourage users to adopt fast_float library instead".

@colesbury
Copy link
Contributor Author

@mdickinson, thanks for the context. I'll plan on only making minimal changes to dtoa to make it thread-safe.

colesbury added a commit to colesbury/cpython that referenced this issue Nov 13, 2023
This avoids using the Bigint free-list in `--disable-gil` builds,
and pre-computes the needed powers of 5 during interpreter initialization.
This makes two changes to dtoa
colesbury added a commit to colesbury/cpython that referenced this issue Nov 13, 2023
This avoids using the Bigint free-list in `--disable-gil` builds
and pre-computes the needed powers of 5 during interpreter initialization.
@colesbury colesbury self-assigned this Nov 14, 2023
colesbury added a commit to colesbury/cpython that referenced this issue Nov 20, 2023
This avoids using the Bigint free-list in `--disable-gil` builds
and pre-computes the needed powers of 5 during interpreter initialization.
mdickinson pushed a commit that referenced this issue Dec 7, 2023
This updates `dtoa.c` to avoid using the Bigint free-list in --disable-gil builds and
to pre-computes the needed powers of 5 during interpreter initialization.

* gh-111962: Make dtoa thread-safe in `--disable-gil` builds.

This avoids using the Bigint free-list in `--disable-gil` builds
and pre-computes the needed powers of 5 during interpreter initialization.

* Fix size of cached powers of 5 array.

We need the powers of 5 up to 5**512 because we only jump straight to
underflow when the exponent is less than -512 (or larger than 308).

* Rename Py_NOGIL to Py_GIL_DISABLED

* Changes from review

* Fix assertion placement
aisk pushed a commit to aisk/cpython that referenced this issue Feb 11, 2024
…thon#112049)

This updates `dtoa.c` to avoid using the Bigint free-list in --disable-gil builds and
to pre-computes the needed powers of 5 during interpreter initialization.

* pythongh-111962: Make dtoa thread-safe in `--disable-gil` builds.

This avoids using the Bigint free-list in `--disable-gil` builds
and pre-computes the needed powers of 5 during interpreter initialization.

* Fix size of cached powers of 5 array.

We need the powers of 5 up to 5**512 because we only jump straight to
underflow when the exponent is less than -512 (or larger than 308).

* Rename Py_NOGIL to Py_GIL_DISABLED

* Changes from review

* Fix assertion placement
Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024
…thon#112049)

This updates `dtoa.c` to avoid using the Bigint free-list in --disable-gil builds and
to pre-computes the needed powers of 5 during interpreter initialization.

* pythongh-111962: Make dtoa thread-safe in `--disable-gil` builds.

This avoids using the Bigint free-list in `--disable-gil` builds
and pre-computes the needed powers of 5 during interpreter initialization.

* Fix size of cached powers of 5 array.

We need the powers of 5 up to 5**512 because we only jump straight to
underflow when the exponent is less than -512 (or larger than 308).

* Rename Py_NOGIL to Py_GIL_DISABLED

* Changes from review

* Fix assertion placement
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes topic-free-threading type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

3 participants