-
-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-27397: Make email module properly handle invalid-length base64 strings #7583
Conversation
This defines a new InvalidBase64LengthDefect defect, which is returned with the encoded string when attempting to base64-decode a string of invalid length (1 mod 4).
Lib/email/_encoded_words.py
Outdated
raise AssertionError("unexpected binascii.Error") | ||
# This only happens when the encoded string's length is 1 more | ||
# than a multiple of 4, which is invalid. | ||
defects = [errors.InvalidBase64LengthDefect()] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This replacement of the defect list bothers me, it makes the code more fragile.
I think we should probably instead drop the 'pad_err' computation, since we're going to figure that out in the try/except, and then switch to computing the (value, defects) tuple and have a single return at the end of the method. That way the PaddingDefect can be added in the else clause of that try/except, and we'll always be appending to the defect list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bitdancer, I've tried to address the code's fragility and opaqueness with some restructuring and improved comments.
Adding missing padding in the first attempt, which uses validate=True
, is required to pass the validation in the case where all of the characters are valid but just padding is missing. This is different than the following attempts, where it is assumed that there are invalid characters, and decoding is attempted first with and then without adding padding only in order to be able to tell whether padding was missing.
When you're done making the requested changes, leave the comment: |
I have made the requested changes; please review again. |
Thanks for making the requested changes! @bitdancer: please review the changes made to this pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that looks good. Consistency in the handling of the defects was what I wanted, and your improvement of the comments about exactly what we are doing and why is great.
Removing "backport to 3.6" until https://bugs.python.org/issue27397#msg319338 is resolved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't Doc/library/emails.errors.rst be updated, too?
When you're done making the requested changes, leave the comment: |
I have made the requested changes; please review again. |
Thanks for making the requested changes! @bitdancer, @ned-deily: please review the changes made to this pull request. |
@taleinat: Please replace |
Thanks @taleinat for the PR 🌮🎉.. I'm working now to backport this PR to: 3.6, 3.7. |
GH-7664 is a backport of this pull request to the 3.7 branch. |
…rings (pythonGH-7583) When attempting to base64-decode a payload of invalid length (1 mod 4), properly recognize and handle it. The given data will be returned as-is, i.e. not decoded, along with a new defect, InvalidBase64LengthDefect. (cherry picked from commit c3f55be) Co-authored-by: Tal Einat <[email protected]>
GH-7665 is a backport of this pull request to the 3.6 branch. |
…rings (pythonGH-7583) When attempting to base64-decode a payload of invalid length (1 mod 4), properly recognize and handle it. The given data will be returned as-is, i.e. not decoded, along with a new defect, InvalidBase64LengthDefect. (cherry picked from commit c3f55be) Co-authored-by: Tal Einat <[email protected]>
…rings (GH-7583) (GH-7664) When attempting to base64-decode a payload of invalid length (1 mod 4), properly recognize and handle it. The given data will be returned as-is, i.e. not decoded, along with a new defect, InvalidBase64LengthDefect. (cherry picked from commit c3f55be) Co-authored-by: Tal Einat <[email protected]>
…rings (GH-7583) (GH-7665) When attempting to base64-decode a payload of invalid length (1 mod 4), properly recognize and handle it. The given data will be returned as-is, i.e. not decoded, along with a new defect, InvalidBase64LengthDefect. (cherry picked from commit c3f55be) Co-authored-by: Tal Einat <[email protected]>
The output of base64 encoding can never have a length of 1 more than a multiple of 4 (not including line breaks and padding).
Attempting to decode base64-encoded email payloads of such length would result in an
AssertionError
. This fix properly detects and handles such cases, introducing a new type of defect to be returned in this case. As decided in the b.p.o. issue's comments, in this case the encoded data is returned as-is.Note: This also includes a change to
decode_b()
in_encoded_words.py
, where instead of trying padding with 0-3 '=' characters, it just tries using zero or two. Using 3 will never help since 2 is the maximum possibly needed. Since extra padding is ignored bybinascii.a2b_base64()
and hence bybase64.b64decode()
, trying with 2 added padding characters will work whether 1 or 2 are needed. I've added a comment to this effect as well as a test.https://bugs.python.org/issue27397