gh-55688: Add note about ending backslashes for raw strings #94768
Doc/tutorial/introduction.rst
Outdated
@@ -189,6 +189,29 @@ the first quote::
>>> print(r'C:\some\name') # note the r before the quote
C:\some\name

There is one subtle aspect to raw strings: a raw string may not end in an odd
The text of the addition is fine. But I am wondering whether such a subtle point should be part of introduction.rst. Perhaps keep only the first line and refer to a location with details? (I'm not sure what would be a better location.)
It's not strictly a part of this file, see https://docs.python.org/3/reference/lexical_analysis.html:
... even a raw string cannot end in an odd number of backslashes ...
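For anyone following along, here is a minimal sketch of the rule quoted above. It compiles the source text of two raw literals at runtime so the failing one does not break the example file itself; the helper name `try_compile` is made up for illustration and is not part of the PR.

```python
def try_compile(source):
    """Return None if `source` compiles as an expression, else the SyntaxError."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError as exc:
        return exc
    return None


# Source text r'C:\some\' : one trailing backslash (odd), so the final
# quote is treated as escaped and the literal never terminates.
odd_source = "r'C:\\some\\'"

# Source text r'C:\some\\' : two trailing backslashes (even), which is fine.
even_source = "r'C:\\some\\\\'"

odd_error = try_compile(odd_source)
even_error = try_compile(even_source)

print("odd: ", "SyntaxError" if odd_error else "ok")
print("even:", "SyntaxError" if even_error else "ok")
```

The exact error message varies between Python versions, so the sketch only checks whether a `SyntaxError` was raised.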
The text you added in this PR will end up here: https://docs.python.org/3/tutorial/introduction.html#strings right?
It is an informal introduction, so I would write something like:
There is one subtle aspect to raw strings: a raw string may not end in an odd number of \ characters. For details see String and Bytes literals.
But the text added is correct, so it is mostly a matter of style.
Re-reading this, I realize that my comment above doesn't actually address your original concern on whether this is too subtle of a point for the tutorial introduction - IMO, since the section above does talk about escaping quotes as well as raw strings, I think it's somewhat natural to at least mention the case with them combined.
As for how brief the mention should be, the current phrasing can definitely be shortened. I also think that just one of the workarounds can be left in, with the other two examples moved elsewhere, along with the wording specific to Windows paths. If the section introduces the problem of raw strings not being able to end in an odd number of backslashes, I think it should also suggest some sort of solution to it, so I would prefer keeping one example in.
As for moving the rest of the information, I don't think the reference is the most appropriate place, maybe instead in some FAQ entry or a howto page?
An FAQ entry sounds reasonable. Maybe we do something like "There is one subtle aspect to raw strings: a raw string may not end in an odd number of \ characters; see [the FAQ entry] for workarounds"?
I agree with eendebakpt that this seems like too subtle a point to spend this much space discussing in the tutorial, so I think we should try to find another home for this.
Maybe an alternate approach would be to add some special casing to give users a more helpful error message in this case?
Thanks, this looks broadly good to me.
My favourite workaround for this would actually be to use (implicit) concatenation. So:
r'C:\this\will\work' '\\'
or
r'C:\this\will\work' + '\\'
I think this is probably less error prone than the strip approach (e.g. you probably want to use rstrip, and what if you have more trailing spaces?), so maybe we swap that workaround for concatenation?
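To make the comparison concrete, here is a rough sketch of both workarounds side by side. The strip-based variant is my reading of the approach referred to above (a placeholder space after the backslash, removed at runtime); the variable names are only for illustration.

```python
# The value we want: a Windows-style path that ends in a single backslash.
expected = 'C:\\this\\will\\work\\'

# Workaround 1: implicit concatenation of adjacent literals.
implicit = r'C:\this\will\work' '\\'

# Workaround 2: explicit concatenation with +.
explicit = r'C:\this\will\work' + '\\'

# Strip-style workaround (assumed form): add a trailing space so the raw
# literal does not end in a backslash, then remove the space again.
stripped = r'C:\this\will\work\ '.rstrip()

# Pitfall: strip() trims both ends, so intended leading whitespace is lost,
# while rstrip() keeps it.
both_ends = r'  indented\ '.strip()
right_only = r'  indented\ '.rstrip()

print(implicit == explicit == stripped == expected)
print(repr(both_ends), repr(right_only))
```

Concatenation never touches the characters inside the raw literal, which is why it seems less error prone to me than any strip variant.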
Oh one other thing that is sort of confusing that we could clarify here is that in
Doc/tutorial/introduction.rst
Outdated
@@ -189,6 +189,16 @@ the first quote::
>>> print(r'C:\some\name') # note the r before the quote
C:\some\name

Note that escaping quotes in raw strings will keep the backslash::
Maybe to be explicit, it could be something like
Note that escaping quotes in raw strings will keep the backslash::
Note that, unlike the non–raw string case, escaping quotes in raw strings will keep the backslash::
Thanks for making the change! This looks good, but let's move it to the FAQ section?
My reasoning is that, given the description of raw strings, it's not really surprising that the backslash is preserved. It's only really surprising in the context of the end quote situation, where the backslash "escapes" the quote for the tokeniser, but not for the actual value.
That is, if you read:
A raw string ending with an odd number of backslashes will escape the string's quote
you may come away with the impression that raw strings are raw except for quotes or something. I guess a concrete phrasing would be adding the following to the end of the FAQ entry:
Note that while a backslash will "escape" a quote for the purposes
of determining where the raw string ends, there are no escape
sequences that affect the interpretation of the value of the raw string.
That is, the backslash remains present in the value of the raw string::
>>> r'backslash\'preserved'
"backslash\\'preserved"
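As a quick illustration of the proposed wording, the sketch below compares the same source text with and without the `r` prefix. The variable names `raw` and `cooked` are just for this example.

```python
# Raw literal: the backslash lets the quote appear inside the literal,
# but it stays in the resulting value.
raw = r'backslash\'preserved'

# Ordinary literal: \' is an escape sequence, so the backslash is consumed.
cooked = 'backslash\'preserved'

print(len(raw), len(cooked))   # the raw value is one character longer
print('\\' in raw, '\\' in cooked)
```

So the backslash affects where the tokeniser ends the literal in both cases, but only the ordinary literal drops it from the value.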
Okay, there's prior art. Here's the relevant words from Lexical Analysis:
Even in a raw literal, quotes can be escaped with a backslash, but the backslash remains in the result; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw literal cannot end in a single backslash (since the backslash would escape the following quote character). Note also that a single backslash followed by a newline is interpreted as those two characters as part of the literal, not as a line continuation.
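A small sketch of the two claims in that passage, in case it helps reviewers check the wording. It uses triple-quoted literals for the newline case; the variable names are illustrative only.

```python
# r"\"" is two characters: a backslash and a double quote.
two_chars = r"\""

# In a raw literal, backslash + newline stays in the value as two characters.
raw_multiline = r"""first\
second"""

# In an ordinary literal, backslash + newline is a line continuation
# and contributes nothing to the value.
cooked_multiline = """first\
second"""

print(len(two_chars))
print(repr(raw_multiline))
print(repr(cooked_multiline))
```

The raw version keeps both the backslash and the newline, while the ordinary version joins the two lines.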
Hmm, I'm not entirely sure I understand what "determining where the raw string ends" means in terms of tokenization, but I've committed the changes (and added a link to the reference) for the time being.
Thanks, this looks good to me cc @JelleZijlstra
Thanks @slateny for the PR, and @hauntsaninja for merging it 🌮🎉.. I'm working now to backport this PR to: 3.10, 3.11.
Sorry @slateny and @hauntsaninja, I had trouble checking out the
GH-100570 is a backport of this pull request to the 3.10 branch.
…thonGH-94768) (cherry picked from commit b95b1b3) Co-authored-by: Stanley <[email protected]> Co-authored-by: hauntsaninja <[email protected]>
Thanks @slateny for the PR, and @hauntsaninja for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11.
GH-100571 is a backport of this pull request to the 3.11 branch.
https://docs.python.org/3/tutorial/introduction.html
Co-authored-by: R. David Murray [email protected]