Creating your own self-hosted YouTube

26 Apr 2024

I love YouTube. You can find free educational content about almost anything, and it's how I have learned a ton about home renovation. But YouTube has become riddled with ads, and you can easily end up in a death scroll that wastes your time. Furthermore, videos get removed from YouTube all the time. Both creators and the platform can have their reasons for removing videos, so get them while they are available! I am also somewhat of a datahoarder, so this project is right up my alley.

Let's start by looking at how we get information about videos automatically.

YouTube API

YouTube has made it easy to get data from its platform using the API. The documentation is quite good and you can do a bunch of things in various programming languages. My focus will be on Python. I won't give you a finished program, but I'll guide you most of the way.

First you'll need to get a developer key. You need to do the following:

  1. Log in or create an account on https://developers.google.com/.
  2. When logged in, follow this link https://console.developers.google.com/project and click the “CREATE PROJECT” button.
  3. Follow the menu and make sure you enable the YouTube Data API v3.

When you are done creating your project, you should be able to obtain an API key. The googleapiclient module is part of the google-api-python-client package, which you can install using pip. Initiating your connection to the API is done like this:
import googleapiclient.discovery

youtube = googleapiclient.discovery.build(
    'youtube', 'v3', developerKey = MY_DEVELOPER_KEY
)
Replace MY_DEVELOPER_KEY with your developer key.
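
If you would rather not paste the key directly into your scripts, one option is to read it from an environment variable. This is only a sketch: the variable name YOUTUBE_API_KEY and the helper load_developer_key are made up for illustration and are not something the API requires.

```python
import os


def load_developer_key(env_var='YOUTUBE_API_KEY', environ=os.environ):
    """Return the API key stored in an environment variable.

    The variable name is an assumption for this example; use whatever
    name fits your setup.
    """
    key = environ.get(env_var, '').strip()
    if not key:
        raise RuntimeError(f'Set the {env_var} environment variable to your API key')
    return key
```

The result can then be passed as developerKey when building the client.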

Getting information on channels and playlists

I have a list of YouTube channels that I wish to get the content from, but in order to get their information, I need to know their channel IDs. Click on a YouTube video and then click on the channel name just below the video. This will give you a URL that looks similar to this: https://www.youtube.com/channel/UCX6OQ3DkcsbYNE6H8uQQuVA.

The last part of the URL is the channel ID (“UCX6OQ3DkcsbYNE6H8uQQuVA”). We can then get the information like this:
request = youtube.channels().list(
    part = 'snippet',
    id = 'UCX6OQ3DkcsbYNE6H8uQQuVA'
)
response = request.execute()

channel_title = response['items'][0]['snippet']['title']
channel_description = response['items'][0]['snippet']['description']
channel_thumbnail_url = response['items'][0]['snippet']['thumbnails']['default']['url']
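
If you keep your channels as a list of URLs, pulling the ID out by hand gets tedious. The following is a minimal sketch that extracts the ID from a URL of the /channel/ form shown above. The helper name channel_id_from_url is made up for this example, and URLs in any other form (for instance handle URLs starting with @) simply return None.

```python
from urllib.parse import urlparse


def channel_id_from_url(url):
    """Return the channel ID from a /channel/<ID> URL, or None."""
    # Split the path into its non-empty segments, e.g. ['channel', 'UC...'].
    parts = [part for part in urlparse(url).path.split('/') if part]
    if len(parts) >= 2 and parts[0] == 'channel':
        return parts[1]
    return None
```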
YouTube channels can also have playlists, which are typically lists of videos that belong together in some way (e.g. a project). I want to grab the playlists of the channels I am interested in. You can loop through a channel's playlists like this:
request = youtube.playlists().list(
    part = 'snippet',
    channelId = 'UCX6OQ3DkcsbYNE6H8uQQuVA'
)
response = request.execute()

for item in response['items']:
    playlist_title = item['snippet']['title']
    playlist_description = item['snippet']['description']
    playlist_thumbnail_url = item['snippet']['thumbnails']['default']['url']
But I am also interested in certain playlists where I am not necessarily interested in the channel. You can get information about specific playlists like this:
request = youtube.playlists().list(
    part = 'snippet',
    id = PLAYLIST_ID
)
response = request.execute()

playlist_title = response['items'][0]['snippet']['title']
playlist_description = response['items'][0]['snippet']['description']
playlist_thumbnail_url = response['items'][0]['snippet']['thumbnails']['default']['url']
The playlist ID can be found in a similar way to the channel ID: open the playlist on YouTube and look at the URL, where the ID is the value of the “list” parameter.
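
As a small illustration, the playlist ID can be read from the “list” query parameter of a playlist URL. This is a sketch only; the helper name playlist_id_from_url and the ID PL_EXAMPLE in the usage note are made up.

```python
from urllib.parse import parse_qs, urlparse


def playlist_id_from_url(url):
    """Return the value of the 'list' query parameter, or None."""
    values = parse_qs(urlparse(url).query).get('list')
    return values[0] if values else None
```

For example, playlist_id_from_url('https://www.youtube.com/playlist?list=PL_EXAMPLE') gives 'PL_EXAMPLE', which can be used as PLAYLIST_ID above.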

Getting information on YouTube videos

Finally, let's grab the videos from all the channels and playlists that we have found. The videos uploaded to a channel can be found like this:
request = youtube.channels().list(
    part = 'contentDetails',
    id = CHANNEL_ID
)
response = request.execute()

# Grab the channel's uploads playlist.
playlistId = response['items'][0]['contentDetails']['relatedPlaylists']['uploads']
request = youtube.playlistItems().list(
    part = 'snippet',
    playlistId = playlistId,
)
response = request.execute()

# Each item in the uploads playlist describes one video.
for video in response['items']:
    video_id = video['snippet']['resourceId']['videoId']
    channel_title = video['snippet']['channelTitle']
    title = video['snippet']['title']
    thumbnail = video['snippet']['thumbnails']['default']['url']
    description = video['snippet']['description']
    published_at = video['snippet']['publishedAt']
For all the playlists you are interested in, you already know the playlist ID, so for those you can skip the first request and go straight to playlistItems().list.
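
One thing to be aware of is that a single list request returns only one page of results (five items by default, at most 50 when maxResults is set), so a channel with many uploads needs several requests. The sketch below follows the nextPageToken field of each response until there are no more pages. The helper name iter_all_items is made up for this example; it takes a list method such as youtube.playlistItems().list and the parameters you would normally pass to it.

```python
def iter_all_items(list_method, **params):
    """Yield every item from a paginated YouTube Data API list call."""
    page_token = None
    while True:
        kwargs = dict(params, maxResults=50)
        if page_token:
            # Ask for the page that follows the previous response.
            kwargs['pageToken'] = page_token
        response = list_method(**kwargs).execute()
        yield from response.get('items', [])
        page_token = response.get('nextPageToken')
        if not page_token:
            break
```

It could be used as "for video in iter_all_items(youtube.playlistItems().list, part='snippet', playlistId=playlistId):", with the loop body unchanged from the example above. The same helper should work for playlists().list.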

Now that we have the video IDs, we can start downloading.

Downloading YouTube videos automatically

YouTube-DL is a brilliant tool for downloading videos from the internet. Even though the name seems rather specific, the tool can be used to download videos from a multitude of websites. The tool is really powerful, letting you download videos exactly how you want them with regard to video format, audio format and so on. It even downloads subtitles.


Running YouTube-DL from Python

YouTube-DL is available in the Python package “youtube_dl”. Install it using pip. I use the package like this:
import youtube_dl
ydl_opts = {
    'format': 'best',
    'writesubtitles': True,
    'subtitlesformat': 'srt',
    'subtitleslangs': ['da', 'en']
}

with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=' + VIDEO_ID])
The option “format: best” means that YouTube-DL grabs the best-quality format that contains both video and audio in a single file. The rest of the options are for subtitles, where I want Danish and English.
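
Since the download script runs repeatedly, it helps to avoid fetching the same video twice. The sketch below extends the options above with three more YoutubeDL options: outtmpl (where files are written), download_archive (a text file that records finished downloads so they are skipped next time) and ignoreerrors (continue with the next video if one fails). The function names and the directory layout are assumptions for illustration, not the setup used in this article.

```python
import os


def build_ydl_opts(download_dir, archive_file):
    """Build youtube_dl options that skip already downloaded videos."""
    return {
        'format': 'best',
        'writesubtitles': True,
        'subtitlesformat': 'srt',
        'subtitleslangs': ['da', 'en'],
        # Name each file after its video ID, e.g. <download_dir>/<id>.mp4.
        'outtmpl': os.path.join(download_dir, '%(id)s.%(ext)s'),
        # IDs of finished downloads are recorded here and skipped later.
        'download_archive': archive_file,
        # Keep going if a single video fails (e.g. it was removed).
        'ignoreerrors': True,
    }


def download_videos(video_ids, download_dir, archive_file):
    """Download the given video IDs, skipping those in the archive file."""
    # Imported here so build_ydl_opts can be used without youtube_dl installed.
    import youtube_dl

    urls = ['https://www.youtube.com/watch?v=' + video_id for video_id in video_ids]
    with youtube_dl.YoutubeDL(build_ydl_opts(download_dir, archive_file)) as ydl:
        ydl.download(urls)
```

Naming the files after the video ID also makes it easy to match a video file with its thumbnail and its metadata later on.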

Downloading the thumbnails

Since I'll be browsing the videos from a webserver, I also want to get the thumbnails. This is rather easy in Python with the “urllib” package.
import urllib.request
r = urllib.request.urlopen(THUMBNAIL_URL)
with open(IMAGE_LOCATION + VIDEO_ID + '.jpg', 'wb') as f:
    f.write(r.read())
THUMBNAIL_URL is the URL for the thumbnail that we grabbed from the YouTube API. IMAGE_LOCATION is the local folder where you want to save the image and VIDEO_ID is the ID of the video. This is just the way I do it; you can save the images with whatever name you want.

Grabbing images for channels and playlists is essentially the same.
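
The API examples earlier read the “default” thumbnail, which is the smallest size. The thumbnails object can also contain “medium”, “high”, “standard” and “maxres” entries, although not every size is present for every video. The sketch below picks the largest one available and saves it, skipping files that already exist so repeated runs don't download them again. The names best_thumbnail_url and save_thumbnail are made up for this example.

```python
import urllib.request
from pathlib import Path

# Thumbnail keys used by the API, largest first.
THUMBNAIL_SIZES = ('maxres', 'standard', 'high', 'medium', 'default')


def best_thumbnail_url(thumbnails):
    """Return the URL of the largest thumbnail present, or None."""
    for size in THUMBNAIL_SIZES:
        if size in thumbnails:
            return thumbnails[size]['url']
    return None


def save_thumbnail(url, image_dir, name):
    """Save the image at url as <image_dir>/<name>.jpg unless it exists."""
    target = Path(image_dir) / f'{name}.jpg'
    if target.exists():
        return target
    target.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response:
        target.write_bytes(response.read())
    return target
```

With a video item from the API, this could be called as save_thumbnail(best_thumbnail_url(video['snippet']['thumbnails']), IMAGE_LOCATION, video_id).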

Setting up a cronjob

To download videos periodically, we need to set up a cron job of some sort. Type “crontab -e” in your shell to edit your cron jobs. I have my scripts in multiple files with quite telling names, and I want to run them at 06:00 and 18:00, so I have added this:
0 6,18 * * * python3 /PATH_TO_MY_SCRIPTS/get_channels.py
5 6,18 * * * python3 /PATH_TO_MY_SCRIPTS/get_playlists.py
10 6,18 * * * python3 /PATH_TO_MY_SCRIPTS/get_videos.py
15 6,18 * * * python3 /PATH_TO_MY_SCRIPTS/download_images.py
30 6,18 * * * python3 /PATH_TO_MY_SCRIPTS/download_videos.py
Replace PATH_TO_MY_SCRIPTS with whatever location you have your script(s) at.

I could probably run my scripts more efficiently, but I have found this to work fine.
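
One possible alternative to the staggered start times is a single wrapper script that runs the steps one after another, so that a step never starts before the previous one has finished. The sketch below is an assumption about how that could look, not the setup described in this article; the function name run_in_order is made up.

```python
import subprocess
import sys


def run_in_order(script_paths, python=sys.executable):
    """Run each script in turn and stop at the first one that fails.

    Returns the list of scripts that finished successfully.
    """
    finished = []
    for path in script_paths:
        result = subprocess.run([python, path])
        if result.returncode != 0:
            # Later steps depend on earlier ones, so stop here.
            break
        finished.append(path)
    return finished
```

A wrapper could call run_in_order with the five script paths from the crontab above, in the same order, and the crontab would then need only one line for that wrapper.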

Setting up your webserver

Since I am already running a Nginx webserver on my local network, that's what I'll be using.

The YouTube videos get downloaded to a folder outside of the Nginx root folder, so I need to create an alias in the Nginx configuration file (e.g. /etc/nginx/sites-available/default.conf).
location /zfs {
    alias /zfs;
}
This alias makes the /zfs folder on the server available from my webserver under the URL path /zfs (e.g. 192.168.1.10/zfs). The two don't have to match; you can set it up any way you want.

Creating your own YouTube website

I won't really go into detail about how you create your own self-hosted YouTube website, because the one I have created is almost too simple. I have a small Laravel application that shows a feed with the newest videos, a channel list and a list of playlists. It does the job and honestly it doesn't have to be more fancy than that.

Wrapping up

I have glossed over some details, but you should be able to get the general idea from this article. The YouTube API documentation is quite easy to understand once you get the hang of it, and you can do some pretty cool stuff with it.

Let me know in the comment section if you have done something similar, have any tips and tricks, or have any questions. Happy downloading!
