@swapnilraj
Last active January 25, 2019 18:25

Description

This change is part of a larger module for detecting music videos using a set of heuristics. This detector decides whether a video is a music video by checking for the verified artist icon next to the channel name.

YouTube channels that are recognized as music channels get a verified artist icon next to their channel name, much as popular YouTube channels get a verified icon next to theirs.

[Image: verified artist icon example]

Advantages

This check works very well for popular music artists. Almost all the videos these channels upload are music videos, so the icon is a very good indicator for them.

Disadvantages

Independent Artists

Independent artists are usually not recognized by YouTube and so do not get the verified artist icon. For these artists the module would report that they are not artists, which is not ideal. However, the common interface will take into account the confidence each detector has in its prediction, so other predictors that work better for independent artists can improve detection in the future.
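To illustrate how confidence could let other predictors outweigh this one, here is a minimal sketch of a confidence-weighted vote. The common interface is not specified in this description, so the function name `combine`, the `(is_music, confidence)` tuple shape, and the weighting rule are all assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical shape of one detector's output: (is_music, confidence in [0, 1]).
Vote = Tuple[bool, float]


def combine(votes: List[Vote]) -> Tuple[bool, float]:
    """Confidence-weighted vote over detector outputs (an assumed rule).

    Each detector contributes +confidence if it says "music" and
    -confidence if it says "not music". The sign of the normalized
    sum gives the verdict and its magnitude gives the margin.
    """
    total = sum(confidence for _, confidence in votes)
    if total == 0:
        raise ValueError("need at least one vote with non-zero confidence")
    score = sum(c if is_music else -c for is_music, c in votes) / total
    return score > 0, abs(score)


# An independent artist: the icon detector says "not music" at 0.80,
# while a hypothetical second detector says "music" at 0.90.
is_music, margin = combine([(False, 0.80), (True, 0.90)])
```

In this sketch the second detector narrowly wins, so the video is still classified as a music video even though the icon is missing.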

False Positives

The predictor would report false positives for YouTube channels like 'KSIOlajidebt'. Such channels get a verified artist icon because they have some music videos, but most of the videos they upload are not music related.

Implementation

The idea behind the detector is simple: if a video page has the verified artist icon, the detector reports it as a music video page. Because of the false positives described in the Disadvantages section, the predictor cannot be 100% confident in its prediction. Since we have not collected any data yet, the confidence of the detector is fixed at 0.80 for now; once we have more data we will revise it.
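The rule could be sketched roughly as follows. This is not the actual implementation: the names `Prediction` and `detect_by_verified_artist_icon` are hypothetical, and the sketch assumes the caller has already determined whether the icon is present on the page. Only the 0.80 value comes from the description above.

```python
from dataclasses import dataclass

# Fixed placeholder confidence from the description; to be revised
# once real data has been collected.
VERIFIED_ARTIST_CONFIDENCE = 0.80


@dataclass(frozen=True)
class Prediction:
    """Hypothetical result type: the verdict plus the detector's confidence."""
    is_music: bool
    confidence: float


def detect_by_verified_artist_icon(has_verified_artist_icon: bool) -> Prediction:
    """Report a music video page exactly when the icon is present.

    Assumption: the same fixed confidence is used for both outcomes,
    since the description gives a single value for the detector.
    """
    return Prediction(
        is_music=has_verified_artist_icon,
        confidence=VERIFIED_ARTIST_CONFIDENCE,
    )


prediction = detect_by_verified_artist_icon(True)
```

Returning the confidence alongside the verdict keeps the fixed 0.80 in one place, so it can be replaced with a data-driven value later without changing callers.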

@JorikSchellekens

Could you add a link to the gist describing the bigger change? Will you add implementation details here?