Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[generic] prefer enclosures over following links in rss feeds #16189

Merged
merged 7 commits into from
Apr 29, 2018
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
[generic] account for empty strings
  • Loading branch information
bastiandg committed Apr 14, 2018
commit 0aa4e2a2be0c78f25c2a358e9da53ef314868575
2 changes: 1 addition & 1 deletion youtube_dl/extractor/generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -2019,7 +2019,7 @@ def _extract_rss(self, url, video_id, doc):
if not enclosure_nodes:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not mean next_url is obtained.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check was changed and an assignment was added, so that the error can't occur anymore.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing changed here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a change needed here? If not enclosure_nodes is True it is expected that next_url wasn't obtained yet.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what? If not enclosure_nodes is False it does not mean next_url was obtained.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some thought, I think I get what you are saying. If the rss feed has enclosure_nodes, but the url is empty it skips even though it shouldn't. This is fixed now.

next_url = xpath_text(it, 'link', fatal=False)

if next_url is None:
if not next_url:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential reference before assignment error.

continue

entries.append({
Expand Down