@monik3r
Last active August 17, 2022 02:18
Tubearchivist thumbnail extractor
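#!/usr/bin/env python3
import asyncio, glob, os, subprocess, time


async def getNFO():
    while True:
        # Base paths (extension stripped) of every video and every existing thumbnail.
        videos = set([os.path.splitext(val)[0] for val in glob.glob('tube/*/*.mp4')])
        thumbnails = set([os.path.splitext(val)[0] for val in glob.glob('tube/*/*.png')])
        # Extract a thumbnail for each video that does not have one yet.
        for video in videos.difference(thumbnails):
            subprocess.run('ls')  # lists the working directory; looks like leftover debug output
            subprocess.run('ffmpeg -i "' + str(video) + '.mp4" -map "0:2" -frames:v 1 -vframes 1 "' + video + '.png"',
                           check=False, text=True, shell=True)
        # Rescan once a minute without blocking the event loop.
        await asyncio.sleep(60)


def stop():
    # Note: `task` is never defined in this script, so calling stop() would raise NameError.
    task.cancel()


if __name__ == '__main__':
    try:
        # asyncio.run() creates and closes the event loop for us.
        asyncio.run(getNFO())
    except asyncio.CancelledError:
        pass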

erwin commented Aug 17, 2022

Inside of TubeArchivist, did you set a custom Download Format?

I'm using the TubeArchivist default, so maybe that's why my results are different...

Also, my ffmpeg is:

ffmpeg version 4.4.2-0ubuntu0.21.10.1 Copyright (c) 2000-2021 the FFmpeg developers

Running your script I noticed an error coming back from ffmpeg on most files saying:

Stream map '0:2' matches no streams.
To ignore this, add a trailing '?' to the map.
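One way to avoid this error, sketched here as an assumption rather than anything the original script does, is to ask ffprobe how many streams a file has and only pass -map "0:2" when a third stream exists. The helper names (`probe`, `count_streams`, `map_args`) are hypothetical; `probe` needs ffprobe on the PATH.

```python
import json
import subprocess


def probe(path):
    """Run ffprobe and return its JSON report as text (requires ffprobe on PATH)."""
    result = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "stream=index,codec_type",
         "-of", "json", path],
        capture_output=True, text=True, check=True)
    return result.stdout


def count_streams(ffprobe_json):
    """Return how many streams an ffprobe JSON report lists."""
    return len(json.loads(ffprobe_json).get("streams", []))


def map_args(stream_total, wanted_index=2):
    """Return ffmpeg -map arguments for the wanted stream, or [] if it is absent."""
    if stream_total > wanted_index:
        return ["-map", "0:%d" % wanted_index]
    return []  # no explicit map: ffmpeg falls back to its default stream selection
```

For a file with only a video and an audio stream, `map_args(count_streams(probe(path)))` returns an empty list, so the -map option is simply left out.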

I also got an error about using both -frames:v 1 and -vframes 1:

Multiple -frames, -aframes, -vframes or -dframes options specified for stream 0,
 only the last option '-frames:v 1' will be used.

So ultimately I ended up removing -map "0:2" -frames:v 1 from the ffmpeg arguments.

I also added -hide_banner and -loglevel error to quiet down output from ffmpeg.

So the ffmpeg call that worked for me is:

subprocess.run('ffmpeg -hide_banner -loglevel error -i "' + str(video) + '.mp4" -vframes 1 "' + video + '.png"',
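The same call can also be written with an argument list instead of a shell string, so file names containing quotes or `$` need no escaping. This is only a sketch of that variant, not the gist author's code; `thumbnail_command` and `extract_thumbnail` are hypothetical helper names.

```python
import subprocess


def thumbnail_command(video_base):
    """Build the ffmpeg argv that saves the first video frame as a PNG."""
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-i", video_base + ".mp4",
        "-vframes", "1",
        video_base + ".png",
    ]


def extract_thumbnail(video_base):
    """Run ffmpeg without a shell; return True when it exits with status 0."""
    result = subprocess.run(thumbnail_command(video_base), check=False)
    return result.returncode == 0
```

Because no shell is involved, each list element reaches ffmpeg as exactly one argument.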

Also, I added a cron job to make sure the process stays running in case it ever crashes.

I prefer to keep all cron jobs in /etc/crontab so that:

  1. I can see all jobs in one location
  2. I can run each job as whichever user is appropriate

For example, if TubeArchivist is running as the user media, inside /etc/crontab I added:

# m   h  dom mon dow   user-name  command
*     *   *   *   *    media      run-one /<some>/<place>/tubenails.py

run-one isn't installed by default, but it's a very handy way to make sure you don't get duplicate copies of the same job.

So now tubenails should stay running forever, but if it ever crashes, cron will restart it within a minute.

We should never get two or more instances of tubenails running at the same time.
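If run-one isn't available, a similar guarantee could be built into the script itself with an advisory file lock. This is a different technique from the run-one setup above, shown as a minimal sketch for Linux; the function name and the lock path are assumptions.

```python
import fcntl
import os


def acquire_single_instance_lock(lock_path):
    """Return an open file holding an exclusive lock, or None if it is already held."""
    handle = open(lock_path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the lock.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle  # keep a reference for the life of the process


if __name__ == "__main__":
    # Hypothetical lock location; any path writable by the cron user would do.
    lock = acquire_single_instance_lock(os.path.expanduser("~/.tubenails.lock"))
    if lock is None:
        raise SystemExit("tubenails is already running")
```

The kernel releases the lock when the process exits, even after a crash, so the next cron run can take it again.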

Restart cron (or send it kill -HUP) to reload /etc/crontab.

https://manpages.ubuntu.com/manpages/trusty/man1/run-one.1.html
