Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-27397: Make email module properly handle invalid-length base64 strings #7583

Merged
merged 6 commits into from
Jun 12, 2018

Conversation

taleinat
Copy link
Contributor

@taleinat taleinat commented Jun 10, 2018

The output of base64 encoding can never have a length of 1 more than a multiple of 4 (not including line breaks and padding).

Attempting to decode base64-encoded email payloads of such length would result in an AssertionError. This fix properly detects and handles such cases, introducing a new type of defect to be returned in this case. As decided in the b.p.o. issue's comments, in this case the encoded data is returned as-is.

Note: This also includes a change to decode_b() in _encoded_words.py, where instead of trying padding with 0-3 '=' characters, it just tries using zero or two. Using 3 will never help since 2 is the maximum possibly needed. Since extra padding is ignored by binascii.a2b_base64() and hence by base64.b64decode(), trying with 2 added padding characters will work whether 1 or 2 are needed. I've added a comment to this effect as well as a test.

https://bugs.python.org/issue27397

This defines a new InvalidBase64LengthDefect defect, which is returned
with the encoded string when attempting to base64-decode a string
of invalid length (1 mod 4).
raise AssertionError("unexpected binascii.Error")
# This only happens when the encoded string's length is 1 more
# than a multiple of 4, which is invalid.
defects = [errors.InvalidBase64LengthDefect()]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This replacement of the defect list bothers me, it makes the code more fragile.

I think we should probably instead drop the 'pad_err' computation, since we're going to figure that out in the try/except, and then switch to computing the (value, defects) tuple and have a single return at the end of the method. That way the PaddingDefect can be added in the else clause of that try/except, and we'll always be appending to the defect list.

Copy link
Contributor Author

@taleinat taleinat Jun 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bitdancer, I've tried to address the code's fragility and opaqueness with some restructuring and improved comments.

Adding missing padding in the first attempt, which uses validate=True, is required to pass the validation in the case where all of the characters are valid but just padding is missing. This is different than the following attempts, where it is assumed that there are invalid characters, and decoding is attempted first with and then without adding padding only in order to be able to tell whether padding was missing.

@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@taleinat
Copy link
Contributor Author

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@bitdancer: please review the changes made to this pull request.

Copy link
Member

@bitdancer bitdancer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that looks good. Consistency in the handling of the defects was what I wanted, and your improvement of the comments about exactly what we are doing and why is great.

@ned-deily
Copy link
Member

Removing "backport to 3.6" until https://bugs.python.org/issue27397#msg319338 is resolved.

Copy link
Member

@ned-deily ned-deily left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't Doc/library/emails.errors.rst be updated, too?

@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@taleinat
Copy link
Contributor Author

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@bitdancer, @ned-deily: please review the changes made to this pull request.

@taleinat taleinat merged commit c3f55be into python:master Jun 12, 2018
@bedevere-bot
Copy link

@taleinat: Please replace # with GH- in the commit message next time. Thanks!

@miss-islington
Copy link
Contributor

Thanks @taleinat for the PR 🌮🎉.. I'm working now to backport this PR to: 3.6, 3.7.
🐍🍒⛏🤖

@bedevere-bot
Copy link

GH-7664 is a backport of this pull request to the 3.7 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jun 12, 2018
…rings (pythonGH-7583)

When attempting to base64-decode a payload of invalid length (1 mod 4),
properly recognize and handle it.  The given data will be returned as-is,
i.e. not decoded, along with a new defect, InvalidBase64LengthDefect.
(cherry picked from commit c3f55be)

Co-authored-by: Tal Einat <[email protected]>
@bedevere-bot
Copy link

GH-7665 is a backport of this pull request to the 3.6 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jun 12, 2018
…rings (pythonGH-7583)

When attempting to base64-decode a payload of invalid length (1 mod 4),
properly recognize and handle it.  The given data will be returned as-is,
i.e. not decoded, along with a new defect, InvalidBase64LengthDefect.
(cherry picked from commit c3f55be)

Co-authored-by: Tal Einat <[email protected]>
taleinat added a commit that referenced this pull request Jun 12, 2018
…rings (GH-7583) (GH-7664)

When attempting to base64-decode a payload of invalid length (1 mod 4),
properly recognize and handle it.  The given data will be returned as-is,
i.e. not decoded, along with a new defect, InvalidBase64LengthDefect.
(cherry picked from commit c3f55be)

Co-authored-by: Tal Einat <[email protected]>
taleinat added a commit that referenced this pull request Jun 12, 2018
…rings (GH-7583) (GH-7665)

When attempting to base64-decode a payload of invalid length (1 mod 4),
properly recognize and handle it.  The given data will be returned as-is,
i.e. not decoded, along with a new defect, InvalidBase64LengthDefect.
(cherry picked from commit c3f55be)

Co-authored-by: Tal Einat <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants