@HKGx, last active January 6, 2020 19:40
Identify scene

Introduction

I competed in Google Code-in and spotted a neat task from CCExtractor.

The task's title was "How can we identify movies based on scenes in them?", and I'm going to answer that question.

First thoughts

The first thing that came to my mind was to split a video into frames using FFmpeg. After splitting, we could test our input against those frames.

But that's a disastrous idea!

Splitting a video into individual frames is just a waste of our precious disk space. A 24-minute video, split frame by frame, ended up... jamming my entire drive.

And then we have to deal with comparing the frames. How in the world are you gonna do that?!

The better idea

But maybe we can think of something better? Maybe we can extract only one frame out of every n and increase the tolerance of our algorithm?
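The sampling step can be done directly with FFmpeg's select filter; a minimal sketch, assuming a 24 fps input called episode.mp4 (both the filename and the sampling rate are placeholders, not from the gist):

```shell
# Keep one frame out of every 24 (roughly one per second at 24 fps)
# instead of dumping every single frame to disk.
mkdir -p frames
ffmpeg -i episode.mp4 -vf "select='not(mod(n,24))'" -vsync vfr frames/frame_%05d.jpg
```

Lowering the sampling rate trades recall for disk space; even one frame every few seconds is usually enough to identify a scene.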

Sure thing! There are algorithms called perceptual image hashes that work exactly this way. I found a well-written article about perceptual hashing, read it a few times, and then found a Python library (ImageHash) that does the work!
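To make the idea concrete, here is a pure-Python sketch of a difference hash (dHash), one flavour of perceptual hash. It is an illustration, not the library's actual implementation: real libraries first resize the image to a tiny grayscale grid (e.g. 9x8), while here we start from an already-downscaled grid of pixel values.

```python
# Difference hash: set one bit per horizontal neighbour pair, depending on
# which pixel is brighter. Similar images produce hashes with a small
# Hamming distance, even after brightness or saturation shifts.

def dhash_bits(gray: list[list[int]]) -> int:
    """Hash a (rows x cols) grayscale grid by comparing horizontal neighbours."""
    h = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            h = (h << 1) | (1 if left < right else 0)
    return h

def hamming(a: int, b: int) -> int:
    """Number of differing bits: a small distance means visually similar images."""
    return bin(a ^ b).count("1")

grid = [[10, 20, 30], [30, 20, 10]]                 # tiny 2x3 grid -> 4-bit hash
brighter = [[p + 5 for p in row] for row in grid]   # uniform brightness shift
print(dhash_bits(grid))                             # 0b1100 = 12
print(hamming(dhash_bits(grid), dhash_bits(brighter)))  # 0: gradients unchanged
```

Because the hash encodes gradients rather than raw pixel values, a uniform brightness change leaves it untouched, which is exactly the tolerance we were after.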

Now, how are we going to store those hashes? For decent performance, I presume we can have a database table holding the hash as a 64-bit primary key together with a list of movie names.
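That table could look like the following sqlite3 sketch. The table and column names are illustrative, and this differs from the prototype further down, which stores hashes as text with an autoincrement id:

```python
import sqlite3

# Lookup table keyed on the 64-bit perceptual hash, one row per
# (hash, movie) pair so the same frame hash can map to several movies.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE IF NOT EXISTS frame_hash (
    hash  INTEGER NOT NULL,   -- the 64-bit perceptual hash
    movie TEXT    NOT NULL,
    PRIMARY KEY (hash, movie)
)""")
db.executemany("INSERT OR IGNORE INTO frame_hash VALUES (?, ?)",
               [(0x1F3B2C4D5E6F7081, "Some Movie"),
                (0x1F3B2C4D5E6F7081, "Its Remake")])
rows = db.execute("SELECT movie FROM frame_hash WHERE hash = ? ORDER BY movie",
                  (0x1F3B2C4D5E6F7081,)).fetchall()
print([m for (m,) in rows])   # ['Its Remake', 'Some Movie']
```

One caveat: SQLite's INTEGER is signed 64-bit, so hashes with the top bit set need to be stored as their signed equivalent (or as text, like the prototype does).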

It kinda works

I was able to build a small prototype. Its performance isn't the best, because it's just a prototype, BUT IT'S WORKING!

It properly identified that a slightly oversaturated image is not very different from the base one.
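That "not very different" check boils down to the Hamming distance between the two hashes. A small sketch with made-up 16-bit hashes (real ones are 64-bit, and the 10-bit threshold is an assumption to tune, not a value from the prototype):

```python
# Compare hashes by counting differing bits; below the threshold,
# treat the frames as the same scene.
def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

base      = 0b1011_0110_1100_0011
saturated = 0b1011_0110_1100_1011   # one bit flipped by oversaturation
unrelated = 0b0100_1001_0011_1100

THRESHOLD = 10
print(hamming(base, saturated))   # 1  -> same scene
print(hamming(base, unrelated))   # 16 -> different scene
```

The ImageHash library exposes exactly this via the `-` operator on hash objects, which is what the prototype below prints.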

Other methods

Do they exist? Of course. Developers regularly find new ways to do something!

A somewhat similar approach, finding anime based on a scene, is used by https://trace.moe. They use a color layout descriptor algorithm for this.

And there is even an ML approach to this problem, and it works. During my further investigation, I stumbled upon a research paper that used deep learning to extract features from an image (link).

Stranger approach

Apropos of machine learning, can we perhaps think of another way to accomplish our task? Maybe a more theoretical one?

I'm not an expert on the matter (to be fair, I've never tried ML in my life), but what if we could train a model to just recognize movies? I'm probably totally overcomplicating the idea, but what if? What if we could teach our program to recognize a scene and say which movie it came from?

Yeah, what if...

config.py:

IMG_DIR = "/home/user/some-dir/frames/"
HASHES_DB = "/home/user/some-dir/some-file.db"
TARGET = "/home/user/some-dir/frames/certain-frame.jpg"

The main script:

import sqlite3
from os import listdir
from os.path import isfile, join

import imagehash
from PIL import Image

import config

db = sqlite3.connect(config.HASHES_DB)
c = db.cursor()
# the hash is stored as text because I'm too lazy to parse it to an int
c.executescript("""CREATE TABLE IF NOT EXISTS "Hash" (
    "id" INTEGER PRIMARY KEY AUTOINCREMENT,
    "hash" TEXT NOT NULL,
    "movie" TEXT NOT NULL
)
""")

imgs = [f for f in sorted(listdir(config.IMG_DIR))
        if isfile(join(config.IMG_DIR, f)) and f.endswith(".jpg")]

hashes = []
for i in imgs:
    # For testing purposes, restrict the loop to a few images, as hashing
    # everything takes a long time. Maybe parallelize it in the future?
    # multiprocessing might come in handy.
    img = Image.open(join(config.IMG_DIR, i))
    h: imagehash.ImageHash = imagehash.phash(img)
    print(f"name: {i}")
    hashes.append((str(h),))

c.executemany('INSERT INTO Hash(hash, movie) VALUES(?, "DR STONE EPISODE 24")', hashes)
db.commit()

target = imagehash.phash(Image.open(config.TARGET))
# print the difference (Hamming distance) between the target and each stored hash
for idx, row in enumerate(c.execute("SELECT * FROM Hash")):
    print(f"curr idx: {idx + 1}")
    print(target - imagehash.hex_to_hash(row[1]))
db.close()
requirements.txt:

numpy
ImageHash
pillow
HKGx commented Jan 5, 2020

proof of work?

base image: [image]

our target: [image]
index 21 has a really small difference, so it's working as it should
