-
Notifications
You must be signed in to change notification settings - Fork 10k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[vtv] Add new extractor #15462
[vtv] Add new extractor #15462
Conversation
youtube_dl/extractor/vtv.py
Outdated
video_id = self._match_id(url) | ||
webpage = self._download_webpage(url, video_id) | ||
|
||
title = self._html_search_regex(r'<title>(.+?)</title>', webpage, 'title', fatal=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default=None
.
youtube_dl/extractor/vtv.py
Outdated
'info_dict': { | ||
'id': '1014', | ||
'ext': 'm4a', | ||
'title': r're:VOV1 | LiveTV - TV Net .*', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any unrelated suffixes should be removed.
youtube_dl/extractor/vtv.py
Outdated
|
||
thumbnail = mediaplayer_div_attributes.get("data-image") | ||
|
||
json_url = mediaplayer_div_attributes.get("data-file") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Read coding conventions on optional and mandatory fields.
youtube_dl/extractor/vtv.py
Outdated
# but you never know in the future | ||
for stream in video_streams: | ||
formats = self._extract_m3u8_formats(stream.get("url"), video_id, ext=ext, fatal=False) | ||
if len(formats) != 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not formats
.
youtube_dl/extractor/vtv.py
Outdated
if len(formats) != 0: | ||
break | ||
|
||
if re.match(r'https?://[^/]*/video/.*', url) is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/video/ in url
.
youtube_dl/extractor/vtv.py
Outdated
|
||
if re.match(r'https?://[^/]*/video/.*', url) is not None: | ||
is_live = False | ||
elif re.match(r'https?://[^/]*/kenh-truyen-hinh/.*', url) is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/kenh-truyen-hinh/ in url
.
youtube_dl/extractor/vtv.py
Outdated
|
||
# little hack to better support radio streams | ||
if title.startswith("VOV"): | ||
ext = "m4a" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vcodec
of format should be set to 'none'
.
youtube_dl/extractor/vtv.py
Outdated
class VTVIE(InfoExtractor): | ||
_VALID_URL = r'https?://..\.tvnet\.gov\.vn/[^/]*/(?P<id>[0-9]+)/?.*' | ||
_TESTS = [{ | ||
'url': 'http://us.tvnet.gov.vn/kenh-truyen-hinh/1011/vtv1', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each test should be commented on what it actually tests.
youtube_dl/extractor/vtv.py
Outdated
from ..utils import extract_attributes | ||
|
||
class VTVIE(InfoExtractor): | ||
_VALID_URL = r'https?://..\.tvnet\.gov\.vn/[^/]*/(?P<id>[0-9]+)/?.*' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No ..
. .*
at the end is pointless.
Thanks for the review! Hope this commit addresses the issues you found correctly. |
Now actually do that. |
from ..utils import extract_attributes | ||
|
||
class VTVIE(InfoExtractor): | ||
_VALID_URL = r'https?://(au|ca|cz|de|jp|kr|tw|us|vn)\.tvnet\.gov\.vn/[^/]*/(?P<id>[0-9]+)/?' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't capture groups you don't use. Use proper regex to match all country codes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the only country codes that are valid at the moment. Do you think it'd be better to expand it to match every country code, or even [a-z][a-z]?
|
||
from .common import InfoExtractor | ||
|
||
import re |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generic imports should go before youtube-dl's.
mediaplayer_div = self._search_regex(r'(<div[^>]*id="mediaplayer"[^>]*>)', webpage, 'mediaplayer element') | ||
mediaplayer_div_attributes = extract_attributes(mediaplayer_div) | ||
|
||
thumbnail = mediaplayer_div_attributes.get("data-image") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single quotes.
|
||
thumbnail = mediaplayer_div_attributes.get("data-image") | ||
|
||
json_url = mediaplayer_div_attributes["data-file"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same.
'id': video_id, | ||
'title': title, | ||
'thumbnail': thumbnail, | ||
'formats': formats, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formats not sorted.
Before submitting a pull request make sure you have:
In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:
What is the purpose of your pull request?
Description of your pull request and other information
VTV is the national television broadcaster of Vietnam. Their website hosts livestreams and videos from numerous state-owned television channels. They apparently broadcast worldwide. At least I can see the streams here in the Netherlands. This extractor supports audio and video livestreams as well as static videos.