Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiline expression brackets with format specifiers don't work in f-strings #110259

Closed
tusharsadhwani opened this issue Oct 3, 2023 · 26 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-parser type-bug An unexpected behavior, bug, or error

Comments

@tusharsadhwani
Copy link
Contributor

tusharsadhwani commented Oct 3, 2023

Bug report

Bug description:

In accordance with PEP 701, the following code works:

>>> x = 1
>>> f"___{
...     x
... }___"
'___1___'

>>> f"___{(
...     x
... )}___"
'___1___'

But the following fails:

f"__{
    x:d
}__"

This gives:

  File "<stdin>", line 1
    x:d
SyntaxError: unterminated f-string literal (detected at line 2)

Is this intended behaviour? This is not clarified in the PEP.


Similarly,

f"""__{
    x:d
}__"""

Gives:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Invalid format specifier 'd
' for object of type 'int'

CPython versions tested on:

3.12

Operating systems tested on:

macOS

Linked PRs

@tusharsadhwani tusharsadhwani added the type-bug An unexpected behavior, bug, or error label Oct 3, 2023
@AlexWaygood AlexWaygood added interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-parser labels Oct 3, 2023
@pablogsal
Copy link
Member

Thanks for the report! I have opened #110271 to fix it

@ericvsmith
Copy link
Member

ericvsmith commented Oct 3, 2023

I'd be careful on that last one. If x is a datetime.datetime, it's a valid format spec.

>>> x=datetime.datetime.now()
>>> f"""__{
...    x:d
... }__"""
'__d\n__'

@ericvsmith
Copy link
Member

In case that's not clear: format specs for datetime.datetime can contain any characters, and are passed on to strftime. Perhaps more usefully:

>>> f"""__{
...     x:%Y
... %m
... %d
... }__"""
'__2023\n10\n03\n__'

@pablogsal
Copy link
Member

Hummmm, then what you are saying is that we should not include the newline in the single quoted case but we should do it in the triple quoted case?

@pablogsal
Copy link
Member

Ah, indeed, this is the case because the triple quote case works like that in 3.11:

Python 3.11.1 (main, Jan  3 2023, 20:00:19) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime
>>> f"""__{
...    datetime.datetime.now():%Y
...    %m
...    %d
... }__"""
'__2023\n   10\n   04\n__'

@lysnikolaou
Copy link
Contributor

Hummmm, then what you are saying is that we should not include the newline in the single quoted case but we should do it in the triple quoted case?

I don't understand why this is the case. Shouldn't these both lead to the same AST?

Multi-line f-string:

f"""__{
    x:%Y
%m
%d
}__"""

Single-quoted f-string:

f"__{
    x:%Y
%m
%d
}__"

@ericvsmith
Copy link
Member

I don't think it's necessarily an issue. I was just pointing out that the error shown in the original report

f"""__{
    x:d
}__"""

(a ValueError from the bad format spec "d\n" for int) is different from the error in

f"__{
    x:d
}__"

(a SyntaxError). In any event, the format spec should be allowed to have newlines, including trailing newlines.

@tusharsadhwani
Copy link
Contributor Author

@ericvsmith the triple quoted issue is indeed different from the single quoted case.

But, is that a bug?

f"""{
  5:.2f
}"""

doesn't work on 3.11 and below as well, but I feel like it should.

@ericvsmith
Copy link
Member

@tusharsadhwani Maybe I could be talked into changing it, but the change would have to be in float.__format__ to ignore trailing newlines (or maybe all trailing whitespace?). So, it's unrelated to the original SyntaxError reported here. And if we do change float.__format__, the same change should be made for int, str, complex, and Decimal. I think (but won't swear to it} that those are the only built-in types that handle the format spec "mini-language" defined in https://docs.python.org/3/library/string.html#format-specification-mini-language. For types implemented in C that use the mini-language, there is a common function for parsing format specs, so they'd all pick it up "for free". I don't recall if Decimal uses it or if it has its own function for that.

This change cannot be in the parser, because whether or not whitespace in the format spec is an error or is acceptable is up to the type being formatted. As things stand now, datetime.datetime.__format__ allows the whitespace, and float.__format__ and others do not.

@tusharsadhwani
Copy link
Contributor Author

Agreed.

The change for single quotes should still go through IMO, because the AST should be identical for the formatted brackets even if the fstring gets split on multiple lines. For triple quotes the ASTs must be different because of backwards compatibility reasons.

@pablogsal
Copy link
Member

There is still ambiguity on how to treat the single quote case. The most natural way for the tokenizer is to stop the format string on the first \n because the string chunks of single quote f-strings cannot spawn more than one line. So this means that this is valid:

>>> f"__{
... 1
... +
... 1:d
...
...
... }__"

but this is not:

>>> f"__{
... 1
... +
... 1:a
... b
...
... }__"

The other option is allowing multi-line format specs but that may be a bit more annoying to implement. @ericvsmith @lysnikolaou what do you think?

@pablogsal
Copy link
Member

pablogsal commented Oct 5, 2023

PR #110271 implements single-line format specifiers for single quoted f-strings, but we can change it if we think multi-line specs should be allowed.

@tusharsadhwani
Copy link
Contributor Author

It seems like an arbitrary limitation on code formatters: splitting any f-string on the expression brackets will be valid, unless it's the specific case of a single quoted fstring with a format specifier.

@pablogsal
Copy link
Member

Not really, you can still be able to split the f-string in the brackets, but not the format specifier. This is fine because if you split the format specifier you are changing the f-string anyway.

To clarify, you can change:

f"__{1+1:xxx}__"

to

f"__{
1
+
1:xxx


}__"

if you want, but you cannot split the xxx

@tusharsadhwani
Copy link
Contributor Author

Makes sense, the linked PR seems good to me

@pablogsal
Copy link
Member

pablogsal commented Oct 5, 2023

So basically, in other words, if single quoted f-strings format specifiers cannot spawn multiple lines themselves.

@ericvsmith
Copy link
Member

ericvsmith commented Oct 5, 2023

@pablogsal So I think you're saying that for single quoted f-strings, there can't be a newline in the format spec. Although it looks like that's a change, I think it's okay.

ETA: Our messages crossed. I think we're in agreement.

@lysnikolaou
Copy link
Contributor

I think that's okay as well. Tokenizing multi-line format specifiers in single-quoted f-strings would be a bit cumbersome, so let's go with that.

@pablogsal
Copy link
Member

ETA: Our messages crossed. I think we're in agreement.

Sorry, I am also on a plane, so the timing of my messages may be a bit inconsistent :P

@pablogsal
Copy link
Member

Update: i tried to check how annoying would be to support the other option (support multi-line format specifiers) and turns out that is quite easy to support thanks to how we have structured this (yay!). So only if we want we can support that.

It would look like this, for clarification:


code = """\
f"__{
1
+
1:a
b
c

}__"
"""

print(ast.dump(ast.parse(code), indent=4))

which prints:

Module(
    body=[
        Expr(
            value=JoinedStr(
                values=[
                    Constant(value='__'),
                    FormattedValue(
                        value=BinOp(
                            left=Constant(value=1),
                            op=Add(),
                            right=Constant(value=1)),
                        conversion=-1,
                        format_spec=JoinedStr(
                            values=[
                                Constant(value='a\nb\nc\n\n')])),
                    Constant(value='__')]))],
    type_ignores=[])

Sorry for the back and forth @lysnikolaou @ericvsmith. I think we are free to choose whatever we prefer.

@pablogsal
Copy link
Member

Notice that this has an important difference that I think it makes is less appealing: extra trailing newlines are included in the format specifier, which I think it will make formatters unhappy (@tusharsadhwani, please confirm)

@tusharsadhwani
Copy link
Contributor Author

tusharsadhwani commented Oct 5, 2023

Can confirm, second option will restrict code formatting, pretty much making fstrings with format specs unformattable.

But, fstrings don't get formatted by black right now as-is and there's no confirmation on my end that that's desired.

miss-islington pushed a commit to miss-islington/cpython that referenced this issue Oct 5, 2023
…specs (pythonGH-110271)

(cherry picked from commit cc389ef)

Co-authored-by: Pablo Galindo Salgado <[email protected]>
Signed-off-by: Pablo Galindo <[email protected]>
@ericvsmith
Copy link
Member

@tusharsadhwani black can't reformat the format specs without breaking datetime.datetime (for example).

@pablogsal I don't think it matters all that much. I can't imagine that there are many single quote multi-line format specs, although they no doubt exist.

@pablogsal
Copy link
Member

Ok, I ended implementing the single-line restruction for format specs in single-quoted f-strings. Thanks everyone for your help, you rock 🚀

pablogsal added a commit that referenced this issue Oct 5, 2023
… specs (GH-110271) (#110396)

gh-110259: Fix f-strings with multiline expressions and format specs (GH-110271)
(cherry picked from commit cc389ef)

Signed-off-by: Pablo Galindo <[email protected]>
Co-authored-by: Pablo Galindo Salgado <[email protected]>
@tusharsadhwani
Copy link
Contributor Author

@ericvsmith wasn't single quote multiline format spec invalid syntax until now?

@ericvsmith
Copy link
Member

@tusharsadhwani : I thought somewhere on this thread someone said it worked in 3.11, but I guess I misread that.

dhruvmanila added a commit to astral-sh/ruff that referenced this issue Oct 5, 2023
## Summary

Reported at python/cpython#110259

## Test Plan

Add test cases for the fix and update the snapshots
konstin pushed a commit to astral-sh/ruff that referenced this issue Oct 11, 2023
## Summary

Reported at python/cpython#110259

## Test Plan

Add test cases for the fix and update the snapshots
Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-parser type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

5 participants