Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[openload] Generic IP v4 & v6 stream URL parsing (fixes #16137) #16250

Merged
merged 2 commits into from
Apr 24, 2018

Conversation

zopieux
Copy link
Contributor

@zopieux zopieux commented Apr 22, 2018


This is a fix for #16137. The previous regexps were doing painful parsing of what is actually an IP network. Thing is, the original author (and Travis CI!) has no IPv6 connectivity, hence forgot to include IPv6 patterns in regexps (hex and : eg. 2a01:cb15::). This breaks URL parsing for people on a IPv6-enabled network.

This PR replaces the two similar regexps with a more generic one, that includes both IPv4 and IPv6.

flake8 reports warnings around my fix, dunno if I am supposed to fix these in the PR, as it is unrelated. Please tell me if that's a requirement.

@dstftw
Copy link
Collaborator

dstftw commented Apr 22, 2018

  1. Do not touch existing regexes. Append new instead.
  2. As clearly stated in the tutorial any source code must be flake8 compliant.

@zopieux zopieux force-pushed the fix-openload-ipv6 branch 2 times, most recently from b3a081b to bbcfa1d Compare April 22, 2018 11:49
@zopieux
Copy link
Contributor Author

zopieux commented Apr 22, 2018

As clearly stated in the tutorial any source code must be flake8 compliant.

My bad, there are no warning. I was using a slightly deprecated version of flake8 (W504 was added 11 days ago, and is ignored by default on latest flake8).

Do not touch existing regexes. Append new instead.

I updated the regex a bit. While I understand you don't want to break the old behavior, please note that the new regexp is a super-set of the two others:

new = r">\s*([\w~-]+?~(?:(?:[a-f0-9:]+)|(?:[0-9\.]+))~[\w~-]+)\s*<"

old = r">\s*([\w-]+~\d{10,}~\d+\.\d+\.0\.0~[\w-]+)\s*<"
# [\w-]+~            is included in [\w~-]+
# \d{10,}            is included in [\w~-]+
# \d+\.\d+\.0\.0     is a 16-bit IPv4 network, included in [0-9\.]+
# [\w-]+             is included in [\w~-]+ 
#                    the rest is the same

old = r">\s*([\w~-]+~\d+\.\d+\.\d+\.\d+~[\w~-]+)"
# \d+\.\d+\.\d+\.\d+ is an IPv4 address, included in [0-9\.]+
#                    the rest is the same

@dstftw, I hope I made a point. I understand your position as a maintainer, but I would advocate against accumulating knowledge debt: these regexps are clearly not needed and confusing for futur contributors.

@dstftw
Copy link
Collaborator

dstftw commented Apr 22, 2018

Again: these regexes are intentional, in decreasing degree of strictness. Do not touch them.

@zopieux
Copy link
Contributor Author

zopieux commented Apr 22, 2018

Fine, updated. This must be the dumbest patch I ever contributed to some open-source project.

'stream URL'))
r'>\s*([\w~-]+~\d+\.\d+\.\d+\.\d+~[\w~-]+)',
# someId~someIPv4orIPv6~someId
r'>\s*([\w~-]+?~(?:(?:[a-f0-9:]+)|(?:[0-9\.]+))~[\w~-]+)\s*<'),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v4 is already matched by the previous regex, this regex must only match v6.

@zopieux
Copy link
Contributor Author

zopieux commented Apr 22, 2018

Okay, so now we've got three regexes doing the exact same thing, with both very specific patterns for the middle part and very generic ones (\w) for the other parts.

@dstftw
Copy link
Collaborator

dstftw commented Apr 22, 2018

Nah. You did not even grasp the point.

@zopieux
Copy link
Contributor Author

zopieux commented Apr 22, 2018

Apart from grasping some point, does the latest patch still require changes to meet your requirements?


Now we can also have a constructive discussion about these regexps, but you might want to express your view better than telling contributors they are not capable of understanding whatever point you are trying to make.

@dstftw dstftw merged commit 7603054 into ytdl-org:master Apr 24, 2018
dstftw added a commit that referenced this pull request Apr 24, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants