Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Has the volatile declaration in fsum() outlived its usefulness? #100833

Closed
rhettinger opened this issue Jan 7, 2023 · 4 comments
Closed

Has the volatile declaration in fsum() outlived its usefulness? #100833

rhettinger opened this issue Jan 7, 2023 · 4 comments
Assignees
Labels
type-feature A feature request or enhancement

Comments

@rhettinger
Copy link
Contributor

rhettinger commented Jan 7, 2023

This bit of defensive programming is costly. Removing the volatile declaration reduces the running time of fsum() for a hundred floats from 2.26 usec per loop to 1.42 usec per loop.

I'm thinking the x87 issues have mostly faded. If we do need to keep this, can it be wrapped in an #ifdef so that we don't have everyone paying for a problem that almost nobody has?

Linked PRs

@rhettinger rhettinger added the type-feature A feature request or enhancement label Jan 7, 2023
@mdickinson
Copy link
Member

It certainly seems worth attempted removal: I'd say remove the qualifier, run the buildbots, and see what happens.

Incidentally, this has little to do with x87 (the "this prevents double rounding" text in the comment is inaccurate). It's a defence against the compiler ignoring the temporary assignment to hi (so effectively inlining as yr = (x + y) - x) and then re-associating to yr = (x - x) + y. Neither of those things should happen in a conforming compiler.

@mdickinson
Copy link
Member

mdickinson commented Jan 8, 2023

the "this prevents double rounding" text in the comment is inaccurate

For the record, here's an example of how double-rounding can break the contract of 2SUM, volatile qualifiers notwithstanding.

Suppose that x and y are IEEE 754 binary64 floats with values x = 0x1.0000000000001p+63, y = 0x1.ffc0000000001p+9. If the sum x + y were subject to double-rounding, rounding first to the x87 format's 64-bit precision and then to the binary64 format's 53-bit precision (assuming round-ties-to-even in both cases), we'd get 0x1.00000000000018p+63 from the first round and 0x1.0000000000002p+63 from the second.

So our first 2SUM output hi would be 0x1.0000000000002p+63. But now we have a problem: the guarantee that 2SUM makes is that it will return binary64 float values hi and lo such that hi + lo exactly equals x + y; that is, such that lo is exactly (the negation of) the error in approximating x + y by hi. But there's no way to satisfy that guarantee, since x + y - hi is not representable as a binary64 float: it would need a 54-bit significand to be able to represent exactly. So the 2SUM contract is broken. Adding volatile declarations doesn't help with this: the damage is done the moment that x + y is computed in 64-bit precision instead of 53-bit precision.

Supporting Python code:

>>> from fractions import Fraction as F
>>> x = float.fromhex('0x1.0000000000001p+63')
>>> y = float.fromhex('0x1.ffc0000000001p+9')
>>> # x + y is in the binade [2**63, 2**64), so rounding
>>> # to 64-bit precision is equivalent to rounding to the nearest int
>>> hi64 = round(F(x) + F(y))  # round to 64-bit precision
>>> hi = float(hi64)  # then back to 53-bit precision - double rounding!
>>> hi.hex()
'0x1.0000000000002p+63'
>>> lo = F(x) + F(y) - F(hi)  # value that would be needed to satisfy x + y == hi + lo
>>> lo == float(lo)  # but it's not representable as a float
False
>>> lo.numerator.bit_length()
54

@mdickinson
Copy link
Member

volatile qualifier removed in #100845; awaiting results from the buildbots. If the buildbots are happy, I'll update the comments accordingly. If not, we'll need to investigate further.

@mdickinson
Copy link
Member

The buildbots seem happy; I've marked #100845 as ready for review.

mdickinson added a commit that referenced this issue Jan 8, 2023
This PR removes the `volatile` qualifier on various intermediate quantities
in the `math.fsum` implementation, and updates the notes preceding the
algorithm accordingly (as well as fixing some of the exsting notes). See
the linked issue #100833 for discussion.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

2 participants