[vtv] Add new extractor #15462

tmsbrg · 2018-01-31T22:22:29Z

Before submitting a pull request make sure you have:

At least skimmed through adding new extractor tutorial and youtube-dl coding conventions sections
Searched the bugtracker for similar pull requests
Checked the code with flake8

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

I am the original author of this code and I am willing to release it under Unlicense
I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

Bug fix
Improvement
New extractor
New feature

Description of your pull request and other information

VTV is the national television broadcaster of Vietnam. Their website hosts livestreams and videos from numerous state-owned television channels. They apparently broadcast worldwide; at least, I can see the streams here in the Netherlands. This extractor supports audio and video livestreams as well as static videos.

dstftw · 2018-02-08T19:36:08Z

youtube_dl/extractor/vtv.py

+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+
+        title = self._html_search_regex(r'<title>(.+?)</title>', webpage, 'title', fatal=False)


default=None.
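
For context, the difference this comment points at: in youtube-dl's regex helpers, `fatal=False` still reports a warning when nothing matches, while supplying a default returns it silently. The sketch below is a simplified stand-in for that behaviour, not youtube-dl's actual code; `search_regex` and `NO_DEFAULT` are local names defined only for this illustration.

```python
import re
import warnings

NO_DEFAULT = object()  # local sentinel for this sketch only


def search_regex(pattern, string, name, default=NO_DEFAULT, fatal=True):
    """Simplified stand-in for an extractor regex helper (illustrative only)."""
    mobj = re.search(pattern, string)
    if mobj:
        return mobj.group(1)
    if default is not NO_DEFAULT:
        # A default was supplied: return it quietly, no warning.
        return default
    if fatal:
        raise ValueError('Unable to extract %s' % name)
    # fatal=False without a default: warn, then return None.
    warnings.warn('unable to extract %s' % name)
    return None


page = '<html><body>no title here</body></html>'
# Missing title, default given: returns None without any warning.
quiet = search_regex(r'<title>(.+?)</title>', page, 'title', default=None)
```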

dstftw · 2018-02-08T19:36:30Z

youtube_dl/extractor/vtv.py

+        'info_dict': {
+            'id': '1014',
+            'ext': 'm4a',
+            'title': r're:VOV1 | LiveTV - TV Net .*',


Any unrelated suffixes should be removed.
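
One possible way to drop the site-wide suffix before returning the title is sketched below. It assumes the page title always has the form "<name> | LiveTV - TV Net ...", which is inferred from the test expectation above; `clean_title` is a hypothetical helper, not part of the PR.

```python
import re


def clean_title(raw_title):
    """Strip a trailing '| LiveTV ...' site suffix, if present."""
    # Assumption: the unrelated suffix always starts with '| LiveTV'.
    return re.sub(r'\s*\|\s*LiveTV\b.*$', '', raw_title).strip()


cleaned = clean_title('VOV1 | LiveTV - TV Net')
```

With the suffix removed, the test's `title` could be an exact string rather than a regular expression.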

dstftw · 2018-02-08T19:37:05Z

youtube_dl/extractor/vtv.py

+
+        thumbnail = mediaplayer_div_attributes.get("data-image")
+
+        json_url = mediaplayer_div_attributes.get("data-file")


Read coding conventions on optional and mandatory fields.
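
A minimal sketch of the distinction, using a plain dict in place of the parsed attributes. The attribute values are made up; the idea is that an optional field may be absent without breaking extraction, while a mandatory one should fail loudly.

```python
# Attributes as they might come back from parsing the player <div>
# (values are invented for this sketch).
attrs = {'data-file': 'https://example.invalid/stream.json'}

# Optional field: a missing thumbnail must not break extraction.
thumbnail = attrs.get('data-image')

# Mandatory field: without the stream URL nothing can be extracted,
# so a KeyError here is the desired early failure.
json_url = attrs['data-file']
```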

dstftw · 2018-02-08T19:37:33Z

youtube_dl/extractor/vtv.py

+        # but you never know in the future
+        for stream in video_streams:
+            formats = self._extract_m3u8_formats(stream.get("url"), video_id, ext=ext, fatal=False)
+            if len(formats) != 0:


not formats.
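
The suggestion is to rely on list truthiness rather than comparing the length to zero. A small sketch of the same loop shape follows; `first_non_empty` is a hypothetical helper and the sample data is invented.

```python
def first_non_empty(candidates):
    """Return the first non-empty list from an iterable of lists, else []."""
    for formats in candidates:
        if formats:  # truthiness check instead of len(formats) != 0
            return formats
    return []


picked = first_non_empty([[], [{'url': 'a'}], [{'url': 'b'}]])
```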

dstftw · 2018-02-08T19:41:46Z

youtube_dl/extractor/vtv.py

+            if len(formats) != 0:
+                break
+
+        if re.match(r'https?://[^/]*/video/.*', url) is not None:


/video/ in url.

dstftw · 2018-02-08T19:42:01Z

youtube_dl/extractor/vtv.py

+
+        if re.match(r'https?://[^/]*/video/.*', url) is not None:
+            is_live = False
+        elif re.match(r'https?://[^/]*/kenh-truyen-hinh/.*', url) is not None:


/kenh-truyen-hinh/ in url.
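
Both this comment and the previous one ask for a plain substring test in place of `re.match`. A sketch of what that could look like is below. The diff does not show what the `/kenh-truyen-hinh/` branch assigns, so treating it as live is an assumption, and `guess_is_live` and the `/video/` URL are hypothetical.

```python
def guess_is_live(url):
    """Guess liveness from the URL path (the mapping is an assumption)."""
    if '/video/' in url:
        return False
    if '/kenh-truyen-hinh/' in url:
        # Assumed: channel pages are livestreams.
        return True
    return None  # unknown page type


channel = guess_is_live('http://us.tvnet.gov.vn/kenh-truyen-hinh/1011/vtv1')
```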

dstftw · 2018-02-08T19:42:38Z

youtube_dl/extractor/vtv.py

+
+        # little hack to better support radio streams
+        if title.startswith("VOV"):
+            ext = "m4a"


vcodec of format should be set to 'none'.
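
In youtube-dl, an audio-only format is conventionally flagged by setting its `vcodec` to the string 'none', rather than by forcing a file extension. A minimal sketch on plain format dicts follows; `mark_audio_only` and the URL are hypothetical.

```python
def mark_audio_only(formats):
    """Return copies of the format dicts flagged as having no video stream."""
    return [dict(f, vcodec='none') for f in formats]


radio_formats = mark_audio_only([{'url': 'https://example.invalid/vov1.m3u8'}])
```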

dstftw · 2018-02-08T19:43:27Z

youtube_dl/extractor/vtv.py

+class VTVIE(InfoExtractor):
+    _VALID_URL = r'https?://..\.tvnet\.gov\.vn/[^/]*/(?P<id>[0-9]+)/?.*'
+    _TESTS = [{
+        'url': 'http://us.tvnet.gov.vn/kenh-truyen-hinh/1011/vtv1',


Each test should be commented on what it actually tests.

dstftw · 2018-02-08T19:44:20Z

youtube_dl/extractor/vtv.py

+from ..utils import extract_attributes
+
+class VTVIE(InfoExtractor):
+    _VALID_URL = r'https?://..\.tvnet\.gov\.vn/[^/]*/(?P<id>[0-9]+)/?.*'


No "..". The ".*" at the end is pointless.
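
A quick check of why the trailing part adds nothing, assuming the pattern is applied with `re.match`, which anchors only at the start of the string. The URL is the one from the test above.

```python
import re

URL = 'http://us.tvnet.gov.vn/kenh-truyen-hinh/1011/vtv1'

with_tail = re.match(
    r'https?://..\.tvnet\.gov\.vn/[^/]*/(?P<id>[0-9]+)/?.*', URL)
without_tail = re.match(
    r'https?://..\.tvnet\.gov\.vn/[^/]*/(?P<id>[0-9]+)', URL)

# re.match anchors only at the start, so anything after the id is ignored
# whether or not the pattern spells out a trailing '.*'.
same_id = with_tail.group('id') == without_tail.group('id')
```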

tmsbrg · 2018-02-08T22:29:55Z

Thanks for the review! Hope this commit addresses the issues you found correctly.

dstftw · 2018-02-17T12:49:50Z

"Checked the code with flake8"

Now actually do that.

dstftw · 2018-02-17T12:50:34Z

youtube_dl/extractor/vtv.py

+from ..utils import extract_attributes
+
+class VTVIE(InfoExtractor):
+    _VALID_URL = r'https?://(au|ca|cz|de|jp|kr|tw|us|vn)\.tvnet\.gov\.vn/[^/]*/(?P<id>[0-9]+)/?'


Don't capture groups you don't use. Use proper regex to match all country codes.

tmsbrg · 2018-03-30 (edited)

These are the only country codes that are valid at the moment. Do you think it would be better to expand the pattern to match every country code, or even [a-z][a-z]?
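
The two options under discussion can be sketched as follows. Both use a form that leaves `id` as the only captured group; neither is presented here as the pattern that was eventually merged, and the `fr` URL is a made-up example.

```python
import re

# Explicit list, but non-capturing so the only group is 'id'.
EXPLICIT = (r'https?://(?:au|ca|cz|de|jp|kr|tw|us|vn)'
            r'\.tvnet\.gov\.vn/[^/]*/(?P<id>[0-9]+)')
# Looser alternative raised in the reply: any two-letter subdomain.
GENERIC = r'https?://[a-z]{2}\.tvnet\.gov\.vn/[^/]*/(?P<id>[0-9]+)'

url = 'http://us.tvnet.gov.vn/kenh-truyen-hinh/1011/vtv1'
explicit_match = re.match(EXPLICIT, url)
generic_match = re.match(GENERIC, url)
```

The explicit list rejects subdomains that are not served today, while the generic form keeps working if new ones are added.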

dstftw · 2018-02-17T12:51:06Z

youtube_dl/extractor/vtv.py

+
+from .common import InfoExtractor
+
+import re


Generic imports should go before youtube-dl's.

dstftw · 2018-02-17T12:51:51Z

youtube_dl/extractor/vtv.py

+        mediaplayer_div = self._search_regex(r'(<div[^>]*id="mediaplayer"[^>]*>)', webpage, 'mediaplayer element')
+        mediaplayer_div_attributes = extract_attributes(mediaplayer_div)
+
+        thumbnail = mediaplayer_div_attributes.get("data-image")


Single quotes.

dstftw · 2018-02-17T12:51:59Z

youtube_dl/extractor/vtv.py

+
+        thumbnail = mediaplayer_div_attributes.get("data-image")
+
+        json_url = mediaplayer_div_attributes["data-file"]


dstftw · 2018-02-17T12:52:35Z

youtube_dl/extractor/vtv.py

+            'id': video_id,
+            'title': title,
+            'thumbnail': thumbnail,
+            'formats': formats,


Formats not sorted.
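
In youtube-dl at the time, extractors conventionally called `self._sort_formats(formats)` before returning, so that formats are ordered from worst to best. The sketch below is not that helper. It is a simplified stand-in that orders by height and then bitrate, only to show the ordering being asked for; `sort_formats_sketch` and the sample values are hypothetical.

```python
def sort_formats_sketch(formats):
    """Sort format dicts in place, worst to best, by (height, tbr)."""
    # Missing values sort first, so audio-only entries end up at the bottom.
    formats.sort(key=lambda f: (f.get('height') or -1, f.get('tbr') or -1))


formats = [
    {'format_id': 'hd', 'height': 720, 'tbr': 2500},
    {'format_id': 'audio', 'tbr': 64},
    {'format_id': 'sd', 'height': 360, 'tbr': 800},
]
sort_formats_sketch(formats)
order = [f['format_id'] for f in formats]
```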

[vtv] Add new extractor

b2247d5

dstftw requested changes Feb 8, 2018

View reviewed changes

dstftw added the pending-fixes label Feb 8, 2018

[VTV] Improve based on code review

c045997

dstftw requested changes Feb 17, 2018

View reviewed changes

dstftw closed this in a572ae6 Jun 11, 2018

dstftw added a commit that referenced this pull request Jul 22, 2018

Credit @tmsbrg for #15462

234a858

cypheron mentioned this pull request Feb 3, 2021

Evaluation / overview of new proposed extractors / sites #28054

Open

tmsbrg replied in a review thread Mar 30, 2018 (edited)
