YouTube video downloader with subtitles

Who doesn’t like to watch a good documentary every now and then? It’s even better if you can do it in a language you have learned. Fortunately, YouTube offers endless opportunities to satisfy this need. It has one downside, however: advertisements, which brazenly cut into the most interesting parts. The pytube module lets us avoid this. What follows shows how to download the desired documentary from YouTube together with its subtitles, so that it can be copied to a pen drive, plugged into the TV, and enjoyed without advertisements even during an internet outage.

Before running the code, check that the latest version of the pytube module is installed. The example below uses version 15. After choosing a video, paste its link into the link variable and run the program.
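The version check mentioned above can be done from Python itself. The following is a minimal sketch using the standard library's importlib.metadata (Python 3.8 or newer); the helper names installed_version and major_version are made up for this illustration and are not part of pytube.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def major_version(version_string):
    """Return the leading integer of a version string such as '15.0.0'."""
    return int(version_string.split(".")[0])


if __name__ == "__main__":
    found = installed_version("pytube")
    if found is None:
        print("pytube is not installed; try: pip install pytube")
    elif major_version(found) < 15:
        print(f"pytube {found} is installed; consider: pip install --upgrade pytube")
    else:
        print(f"pytube {found} is installed")
```

The threshold of 15 only mirrors the version used in this post; it is not a requirement stated by pytube.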

An important note: the pytube caption submodule is not up to date with the way YouTube currently encodes its caption XML, so the conversion to SRT may fail. The bug can be fixed by patching the module with the code found at: github.com
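To give an idea of what such a conversion involves, here is a rough, self-contained sketch that turns caption XML into SRT text. It is not the fix from the repository mentioned above. It assumes a layout in which every caption is a p element with a start time t and a duration d, both in milliseconds; that layout, the sample data, and the function names are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET


def ms_to_srt_time(ms):
    """Format milliseconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def xml_to_srt(xml_text):
    """Convert caption XML (assumed layout: <p t="start_ms" d="duration_ms">) to SRT."""
    root = ET.fromstring(xml_text)
    blocks = []
    for p in root.iter("p"):
        # itertext() also collects text nested in child elements
        text = "".join(p.itertext()).strip()
        if not text:
            continue  # skip empty captions
        start = int(p.attrib["t"])
        end = start + int(p.attrib.get("d", 0))
        index = len(blocks) + 1  # SRT blocks are numbered from 1
        blocks.append(
            f"{index}\n{ms_to_srt_time(start)} --> {ms_to_srt_time(end)}\n{text}\n"
        )
    # SRT blocks are separated by one blank line
    return "\n".join(blocks)


# made-up sample input in the assumed layout
sample = (
    '<timedtext format="3"><body>'
    '<p t="0" d="1500">Hello</p>'
    '<p t="61500" d="2000">wor<s>ld</s></p>'
    '</body></timedtext>'
)

if __name__ == "__main__":
    print(xml_to_srt(sample))
```

If the real XML uses other element or attribute names, only the loop inside xml_to_srt needs to change.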

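# import the module
from pytube import YouTube

# where to save the video and the subtitle file
# (a raw string cannot end in a single backslash, so r"C:\" is a syntax error)
SAVE_PATH = "C:\\"

# link of the video to be downloaded
link = "https://youtu.be/BuZFeBkbfUE"

try:
    # create the YouTube object
    yt = YouTube(link)
except Exception as error:
    # stop here: without the yt object nothing below can run
    raise SystemExit(f"Connection Error: {error}")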
  
# pick the first progressive (video and audio in one file) mp4 stream
mp4files = yt.streams.filter(progressive=True, file_extension='mp4').first()
if mp4files is None:
    raise SystemExit("No progressive mp4 stream found")

# download the video; download() returns the path of the saved file
video_path = mp4files.download(output_path=SAVE_PATH)

# Subtitle chooser
# Which subtitles are available?
print(yt.captions)
chosen_subtitle = input('Please choose the subtitle you need: ')

# Get subtitle (None if the language code is not available)
caption = yt.captions.get_by_language_code(chosen_subtitle)
if caption is None:
    raise SystemExit("No subtitle found for code: " + chosen_subtitle)

# convert the subtitle to SRT; the result is a single string
srt = caption.generate_srt_captions()

# write the SRT text next to the video, same name with ".srt" extension
srt_path = video_path.rsplit('.', 1)[0] + '.srt'
with open(srt_path, "w", encoding="utf-8") as f:
    f.write(srt)
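One detail worth a closer look is how the subtitle file name is derived from the video file name. Cutting the name at the first dot truncates titles that contain dots themselves. A minimal sketch with pathlib, which swaps only the final extension, could look like this; the function name subtitle_path_for and the sample file name are made up for the illustration.

```python
from pathlib import Path


def subtitle_path_for(video_path):
    """Return the path of an .srt file sitting next to the given video file."""
    # with_suffix() replaces only the last extension, so dots in the title survive
    return Path(video_path).with_suffix(".srt")


if __name__ == "__main__":
    video = "Mt. Everest ep. 1.mp4"  # made-up file name with dots in the title
    print(video.split(".")[0] + ".srt")  # cuts the title at the first dot
    print(subtitle_path_for(video))      # keeps the full title
```

Keeping the subtitle beside the video under the same base name is a convention many players are said to rely on for picking up subtitles automatically, though this may differ between TV models.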