Using Python to organize your media files

26 Apr 2024

My GoPro Hero camera names its video files GH000000.MP4, with the number incremented for each recording: GH000001.MP4, GH000002.MP4 and so on. The files are in chronological order, but the names don't tell me whether two videos were recorded on the same day or even in the same month. This makes organization a hassle, because I have to check each file manually. To make it easier, I'd love to rename each file to its creation timestamp. Checking the timestamp and renaming every file by hand is tedious, though, so let's automate it.

On Linux you can get the creation timestamp and a lot of other media information with the CLI tool “mediainfo”. The detailed output looks similar to this:
user@pc:~$ mediainfo /home/user/Videos/GH010644.MP4
General
Complete name                            : /home/user/Videos/GH010644.MP4
Format                                   : MPEG-4
Format profile                           : Base Media / Version 1
Codec ID                                 : mp41 (mp41)
File size                                : 1.08 GiB
Duration                                 : 3 min 25 s
Overall bit rate mode                    : Variable
Overall bit rate                         : 45.3 Mb/s
Encoded date                             : UTC 2020-12-20 17:10:36
Tagged date                              : UTC 2020-12-20 17:10:36

Video
ID                                       : 1
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : High@L4.2
Format settings                          : CABAC / 2 Ref Frames
Format settings, CABAC                   : Yes
Format settings, Reference frames        : 2 frames
Format settings, GOP                     : M=1, N=15
Codec ID                                 : avc1
Codec ID/Info                            : Advanced Video Coding
Duration                                 : 3 min 25 s
Bit rate mode                            : Variable
Bit rate                                 : 45.0 Mb/s
Width                                    : 1 920 pixels
Height                                   : 1 080 pixels
Display aspect ratio                     : 16:9
Frame rate mode                          : Constant
Frame rate                               : 29.970 (30000/1001) FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.724
Stream size                              : 1.08 GiB (99%)
Title                                    : GoPro AVC
Language                                 : English
Encoded date                             : UTC 2020-12-20 17:10:36
Tagged date                              : UTC 2020-12-20 17:10:36
Color range                              : Full
Color primaries                          : BT.709
Transfer characteristics                 : BT.709
Matrix coefficients                      : BT.709
Codec configuration box                  : avcC

Audio
ID                                       : 2
Format                                   : AAC LC
Format/Info                              : Advanced Audio Codec Low Complexity
Codec ID                                 : mp4a-40-2
Duration                                 : 3 min 25 s
Bit rate mode                            : Constant
Bit rate                                 : 192 kb/s
Nominal bit rate                         : 48.0 kb/s
Channel(s)                               : 2 channels
Channel layout                           : L R
Sampling rate                            : 48.0 kHz
Frame rate                               : 46.875 FPS (1024 SPF)
Compression mode                         : Lossy
Stream size                              : 4.63 MiB (0%)
Title                                    : GoPro AAC
Language                                 : English
Encoded date                             : UTC 2020-12-20 17:10:36
Tagged date                              : UTC 2020-12-20 17:10:36

Other #1
ID                                       : 3
Type                                     : Time code
Format                                   : QuickTime TC
Duration                                 : 3 min 25 s
Bit rate mode                            : Constant
Frame rate                               : 29.970 (30000/1001) FPS
Time code of first frame                 : 17:45:05:04
Time code, striped                       : Yes
Title                                    : GoPro TCD
Language                                 : English
Encoded date                             : UTC 2020-12-20 17:10:36
Tagged date                              : UTC 2020-12-20 17:10:36

Other #2
Type                                     : meta
Duration                                 : 3 min 25 s
Bit rate mode                            : Variable

Other #3
Type                                     : meta
Bit rate mode                            : Variable
mdhd_Duration                            : 205272

Renaming media files in Python based on media created timestamp

Since I want to rename multiple files, I'll use “glob” to grab every file that follows this naming convention, like this:
import glob
for filename in glob.glob('GH*.MP4'):
	print(filename)
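The pattern 'GH*.MP4' also matches names such as GHnotes.MP4, and glob returns its results in arbitrary order. Here is a minimal sketch of a stricter lookup, assuming the names are always "GH" followed by six digits as described above. The helper find_gopro_files and the GOPRO_NAME pattern are hypothetical names used only for illustration.

```python
import glob
import os
import re

# Assumption: GoPro names are "GH" followed by exactly six digits.
GOPRO_NAME = re.compile(r'^GH\d{6}\.MP4$')

def find_gopro_files(directory='.'):
    """Return the GoPro video paths in `directory`, sorted by name."""
    candidates = glob.glob(os.path.join(directory, 'GH*.MP4'))
    # Keep only names that match the strict pattern, then sort them
    # so the files are processed in recording order.
    return sorted(
        path for path in candidates
        if GOPRO_NAME.match(os.path.basename(path))
    )

if __name__ == '__main__':
    for filename in find_gopro_files():
        print(filename)
```

Sorting is optional for renaming, but it makes the printed log easier to follow.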
The creation timestamp can then be found by running mediainfo and capturing its output. This can be done with Python's “subprocess” module, whose check_output() function returns whatever the command prints. The output is in bytes, however, so we need to decode it before processing it further. The code looks like this:
import subprocess
output = subprocess.check_output(['mediainfo', filename])
output = output.decode()
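As an alternative to decoding by hand, subprocess.run() can return text directly. The sketch below is one possible way to wrap it and is not part of the original script. run_and_capture and read_mediainfo are hypothetical helper names, and the code assumes Python 3.7 or newer and that mediainfo is installed and on the PATH.

```python
import subprocess

def run_and_capture(command):
    """Run `command` (a list of arguments) and return its stdout as str."""
    # text=True decodes the output for us; check=True raises
    # subprocess.CalledProcessError if the command exits with an error.
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout

def read_mediainfo(filename):
    """Return the mediainfo report for `filename` as a string."""
    # Assumption: the mediainfo CLI is installed and on the PATH.
    # If it is not, subprocess raises FileNotFoundError.
    return run_and_capture(['mediainfo', filename])
```

Calling read_mediainfo('GH010644.MP4') should then give the same text as the two check_output lines above.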
The output can then be split into each line and iterated through like this:
lines = output.split('\n')
for line in lines:
    print(line)  # Do something with each line.
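To show what "doing something" with each line could look like, here is a rough sketch that groups the "key : value" lines under their section heading (General, Video, Audio and so on). It assumes the report has the layout shown in the sample output above. parse_report is a hypothetical helper, not something mediainfo or Python provides.

```python
def parse_report(report):
    """Group "key : value" lines under their section heading."""
    sections = {}
    current = None
    for line in report.splitlines():
        if not line.strip():
            continue  # Blank lines only separate the sections.
        # Split on the first colon only, because values such as
        # "UTC 2020-12-20 17:10:36" contain colons themselves.
        key, separator, value = line.partition(':')
        if separator:
            if current is not None:
                sections[current][key.strip()] = value.strip()
        else:
            # A line without a colon is a section heading.
            current = line.strip()
            sections[current] = {}
    return sections
```

With the sample report, parse_report(output)['General']['Tagged date'] would be the string 'UTC 2020-12-20 17:10:36'.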
The last step is to find the line with the creation date, which starts with “Tagged date”, and extract the timestamp from it. The report contains several such lines, one per section, so we stop at the first one. With the timestamp in hand, we can use os.rename() to rename the file. Putting it all together, it looks like this:
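import glob
import os
import subprocess

for filename in glob.glob('GH*.MP4'):
    # mediainfo prints its report to stdout; check_output returns it as bytes.
    output = subprocess.check_output(['mediainfo', filename])
    output = output.decode()
    lines = output.split('\n')

    # The first "Tagged date" line belongs to the General section.
    taggedDate = ''
    for line in lines:
        if line.startswith('Tagged date'):
            taggedDate = line
            break

    # Skip files without a tagged date instead of renaming them to ".MP4".
    if not taggedDate:
        print(filename, 'has no tagged date, skipping')
        continue

    # Reduce "Tagged date   : UTC 2020-12-20 17:10:36" to "2020-12-20 17.10.36".
    taggedDate = taggedDate.replace('Tagged date', '')
    taggedDate = taggedDate.replace(': UTC ', '')
    taggedDate = taggedDate.replace(':', '.')
    taggedDate = taggedDate.strip()
    print(filename, taggedDate)
    os.rename(filename, taggedDate + '.MP4')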
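Two things the script above does not check: that the extracted text is a valid timestamp, and that the target name is free. On Linux, os.rename() silently replaces an existing file. The following is a hedged sketch of both checks, assuming the value has the form 'UTC 2020-12-20 17:10:36' as in the sample output. tagged_date_to_name and unique_target are hypothetical helpers.

```python
import os
from datetime import datetime

def tagged_date_to_name(value, extension='.MP4'):
    """Turn 'UTC 2020-12-20 17:10:36' into '2020-12-20 17.10.36.MP4'."""
    # strptime raises ValueError if the value does not match the
    # expected layout, so a malformed date never becomes a file name.
    stamp = datetime.strptime(value, 'UTC %Y-%m-%d %H:%M:%S')
    return stamp.strftime('%Y-%m-%d %H.%M.%S') + extension

def unique_target(path):
    """Return `path`, or a numbered variant if `path` already exists."""
    base, extension = os.path.splitext(path)
    candidate = path
    counter = 1
    while os.path.exists(candidate):
        candidate = f'{base} ({counter}){extension}'
        counter += 1
    return candidate
```

In the loop, the final line could then become os.rename(filename, unique_target(new_name)), where new_name is the result of tagged_date_to_name().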
Mediainfo provides a lot of information that a more creative mind could probably use for further organization. I'd love to hear your thoughts and ideas on how you do it. Happy organizing!
